Description

The present invention relates to the field of molecular biology. In particular, it relates to, among other things,
nucleotide sequences of Staphylococcus aureus, contigs, ORFs, fragments, probes, primers and related polynucleotides
thereof, peptides and polypeptides encoded by the sequences, and uses of the polynucleotides and sequences thereof,
such as in fermentation, polypeptide production, assays and pharmaceutical development, among others.

The genus Staphylococcus includes at least 20 distinct species. (For a review see Novick, R. P., The Staphylococcus
as a Molecular Genetic System, Chapter 1, pgs. 1-37 in MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI,
R. Novick, Ed., VCH Publishers, New York (1990).) Species differ from one another by 80% or more by hybridization
kinetics, whereas strains within a species are at least 90% identical by the same measure.

The species Staphylococcus aureus, a gram-positive, facultatively aerobic, clump-forming coccus, is among the
most important etiological agents of bacterial infection in humans, as discussed briefly below.

Human Health and S. Aureus 

Staphylococcus aureus is a ubiquitous pathogen. (See, for instance, Mims et al., MEDICAL MICROBIOLOGY,
Mosby-Year Book Europe Limited, London, UK (1993).) It is an etiological agent of a variety of conditions, ranging in
severity from mild to fatal. A few of the more common conditions caused by S. aureus infection are burns, cellulitis,
eyelid infections, food poisoning, joint infections, neonatal conjunctivitis, osteomyelitis, skin infections, surgical wound
infection, scalded skin syndrome and toxic shock syndrome, some of which are described further below.

Burns 

Burn wounds generally are sterile initially. However, they generally compromise physical and immune barriers to
infection, cause loss of fluid and electrolytes and result in local or general physiological dysfunction. After cooling,
contact with viable bacteria results in mixed colonization at the injury site. Infection may be restricted to the non-viable
debris on the burn surface ("eschar"), it may progress into full skin infection and invade viable tissue below the eschar,
and it may reach below the skin, enter the lymphatic and blood circulation and develop into septicaemia. S. aureus is
among the most important pathogens typically found in burn wound infections. It can destroy granulation tissue and
produce severe septicaemia.

Cellulitis 

Cellulitis, an acute infection of the skin that expands from a typically superficial origin to spread below the cutaneous
layer, most commonly is caused by S. aureus in conjunction with S. pyogenes. Cellulitis can lead to systemic infection.
In fact, cellulitis can be one aspect of synergistic bacterial gangrene. This condition typically is caused by a mixture of
S. aureus and microaerophilic streptococci. It causes necrosis, and treatment is limited to excision of the necrotic tissue.
The condition often is fatal.

Eyelid infections

S. aureus is the cause of styes and of 'sticky eye' in neonates, among other eye infections. Typically such infections
are limited to the surface of the eye, and may occasionally penetrate the surface with more severe consequences.

Food poisoning

Some strains of S. aureus produce one or more of five serologically distinct, heat and acid stable enterotoxins
(enterotoxins A-E) that are not destroyed by the digestive processes of the stomach and small intestine. Ingestion of
the toxin, in sufficient quantities, typically results in severe vomiting, but not diarrhoea. The effect does not require
viable bacteria. Although the toxins are known, their mechanism of action is not understood.

Joint infections 

S. aureus infects bone joints, causing diseases such as osteomyelitis.

Osteomyelitis 

S. aureus is the most common causative agent of haematogenous osteomyelitis. The disease tends to occur in
children and adolescents more than adults, and it is associated with non-penetrating injuries to bones. Infection typically
occurs in the long end of growing bone, hence its occurrence in physically immature populations. Most often, infection
is localized in the vicinity of sprouting capillary loops adjacent to epiphysial growth plates in the end of long, growing
bones.

Skin infections 

S. aureus is the most common pathogen of such minor skin infections as abscesses and boils. Such infections
often are resolved by normal host response mechanisms, but they also can develop into severe internal infections.
Recurrent infections of the nasal passages plague nasal carriers of S. aureus.

Surgical Wound Infections 

Surgical wounds often penetrate far into the body. Infection of such wounds thus poses a grave risk to the patient.
S. aureus is the most important causative agent of infections in surgical wounds. S. aureus is unusually adept at
invading surgical wounds: sutured wounds can be infected by far fewer S. aureus cells than are necessary to cause
infection in normal skin. Invasion of surgical wounds can lead to severe S. aureus septicaemia. Invasion of the blood
stream by S. aureus can lead to seeding and infection of internal organs, particularly heart valves and bone, causing
systemic diseases such as endocarditis and osteomyelitis.

Scalded Skin Syndrome 

S. aureus is responsible for "scalded skin syndrome" (also called toxic epidermal necrosis, Ritter's disease and
Lyell's disease). This disease occurs in older children, typically in outbreaks caused by flowering of S. aureus strains
that produce exfoliatin (also called scalded skin syndrome toxin). Although the bacteria initially may infect only a minor
lesion, the toxin destroys intercellular connections, spreads epidermal layers and allows the infection to penetrate the
outer layer of the skin, producing the desquamation that typifies the disease. Shedding of the outer layer of skin
generally reveals normal skin below, but fluid lost in the process can produce severe injury in young children if it is not
treated properly.

Toxic Shock Syndrome 

Toxic shock syndrome is caused by strains of S. aureus that produce the so-called toxic shock syndrome toxin.
The disease can be caused by S. aureus infection at any site, but it is too often erroneously viewed as a
disease solely of women who use tampons. The disease involves toxaemia and septicaemia, and can be fatal.

Nosocomial Infections

In the 1984 National Nosocomial Infection Surveillance Study ("NNIS"), S. aureus was the most prevalent agent
of surgical wound infections in many hospital services, including medicine, surgery, obstetrics, pediatrics and newborns.

Resistance to drugs of S. aureus strains 

Prior to the introduction of penicillin, the prognosis for patients seriously infected with S. aureus was unfavorable.
Following the introduction of penicillin in the early 1940s, even the worst S. aureus infections generally could be treated
successfully. The emergence of penicillin-resistant strains of S. aureus did not take long, however. Most strains of
S. aureus encountered in hospital infections today do not respond to penicillin, although, fortunately, this is not the case
for S. aureus encountered in community infections.

It is well known now that penicillin-resistant strains of S. aureus produce a lactamase which converts penicillin to
penicilloic acid, and thereby destroys antibiotic activity. Furthermore, the lactamase gene often is propagated
episomally, typically on a plasmid, and often is only one of several genes on an episomal element that together confer
multidrug resistance.

Methicillins, introduced in the 1960s, largely overcame the problem of penicillin resistance in S. aureus. These
compounds conserve the portions of penicillin responsible for antibiotic activity and modify or alter other portions that
make penicillin a good substrate for inactivating lactamases. However, methicillin resistance has emerged in S. aureus,
along with resistance to many other antibiotics effective against this organism, including aminoglycosides, tetracycline,
chloramphenicol, macrolides and lincosamides. In fact, methicillin-resistant strains of S. aureus generally are multiply
drug resistant.
The molecular genetics of most types of drug resistance in S. aureus has been elucidated. (See Lyon et al.,
Microbiology Reviews 51: 88-134 (1987).) Generally, resistance is mediated by plasmids, as noted above regarding
penicillin resistance; however, several stable forms of drug resistance have been observed that apparently involve
integration of a resistance element into the S. aureus genome itself.

Thus far, each new antibiotic gives rise to resistant strains, strains emerge that are resistant to multiple drugs,
and increasingly persistent forms of resistance begin to emerge. Drug resistance of S. aureus infections already poses
significant treatment difficulties, which are likely to get much worse unless new therapeutic agents are developed.

Molecular Genetics of Staphylococcus Aureus 

Despite its importance in, among other things, human disease, relatively little is known about the genome of this
organism.

Most genetic studies of S. aureus have been carried out using the strain NCTC 8325, which contains prophages
psi11, psi12 and psi13, and the UV-cured derivative of this strain, 8325-4 (also referred to as RN450), which is free of
the prophages.

These studies revealed that the S. aureus genome, like that of other staphylococci, consists of one circular,
covalently closed, double-stranded DNA and a collection of so-called variable accessory genetic elements, such as
prophages, plasmids, transposons and the like.

Physical characterization of the genome has not been carried out in any detail. Pattee et al. published a low
resolution and incomplete genetic and physical map of the chromosome of S. aureus strain NCTC 8325 (Pattee et al.,
Genetic and Physical Mapping of Chromosome of Staphylococcus aureus NCTC 8325, Chapter 11, pgs. 163-169 in
MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI, R. P. Novick, Ed., VCH Publishers, New York (1990)). The genetic
map largely was produced by mapping insertions of Tn551 and Tn4001, which, respectively, confer erythromycin and
gentamicin resistance, and by analysis of SmaI-digested DNA by Pulsed Field Gel Electrophoresis ("PFGE").

The map was of low resolution; even estimating the physical size of the genome was difficult, according to the
investigators. The size of the largest SmaI chromosome fragment, for instance, was too large for accurate sizing by
PFGE. To estimate its size, additional restriction sites had to be introduced into the chromosome using a transposon
containing a SmaI recognition sequence.

In sum, most physical characteristics and almost all of the genes of Staphylococcus aureus are unknown. Among
the few genes that have been identified, most have not been physically mapped or characterized in detail. Only a very
few genes of this organism have been sequenced. (See, for instance, Thornsberry, J. Antimicrobial Chemotherapy 21
Suppl. C: 9-16 (1988), current versions of GENBANK and other nucleic acid databases, and references that relate to
the genome of S. aureus such as those set out elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S. aureus infection involves the programmed
expression of S. aureus genes, and that characterizing the genes and their patterns of expression would add
dramatically to our understanding of the organism and its host interactions. Knowledge of S. aureus genes and genomic
organization would dramatically improve understanding of disease etiology and lead to improved and new ways of
preventing, ameliorating, arresting and reversing diseases. Moreover, characterized genes and genomic fragments of
S. aureus would provide reagents for, among other things, detecting, characterizing and controlling S. aureus infections.

There is a need, therefore, to characterize the genome of S. aureus and for polynucleotides and sequences of this
organism.

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome. The primary
nucleotide sequences which were generated are provided in SEQ ID NOS: 1-5,191.

The present invention provides the nucleotide sequence of several thousand contigs of the Staphylococcus aureus
genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative
fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one
embodiment, the present invention is provided as contiguous strings of primary sequence information corresponding to
the nucleotide sequences depicted in SEQ ID NOS: 1-5,191.

The present invention further provides nucleotide sequences which are at least 95%, preferably 99% and most
preferably 99.9%, identical to the nucleotide sequences of SEQ ID NOS: 1-5,191.

The nucleotide sequence of SEQ ID NOS: 1-5,191, a representative fragment thereof, or a nucleotide sequence
which is at least 95%, preferably 99% and most preferably 99.9%, identical to the nucleotide sequence of SEQ ID
NOS: 1-5,191 may be provided in a variety of mediums to facilitate its use. In one application of this embodiment, the
sequences of the present invention are recorded on computer readable media. Such media include, but are not limited
to, magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media
such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media.

The present invention further provides systems, particularly computer-based systems, which contain the sequence
information herein described stored in a data storage means. Such systems are designed to identify commercially
important fragments of the Staphylococcus aureus genome.

Another embodiment of the present invention is directed to fragments, preferably isolated fragments, of the
Staphylococcus aureus genome having particular structural or functional attributes. Such fragments of the Staphylococcus
aureus genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter
referred to as open reading frames or "ORFs," fragments which modulate the expression of an operably linked ORF,
hereinafter referred to as expression modulating fragments or "EMFs," and fragments which can be used to diagnose
the presence of Staphylococcus aureus in a sample, hereinafter referred to as diagnostic fragments or "DFs."

Each of the ORFs in fragments of the Staphylococcus aureus genome disclosed in Tables 1-3, and the EMFs
found 5' to the ORFs, can be used in numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in
a sample, to selectively control gene expression in a host, and in the production of polypeptides, such as polypeptides
encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.

The present invention further includes recombinant constructs comprising one or more fragments of the
Staphylococcus aureus genome of the present invention. The recombinant constructs of the present invention comprise
vectors, such as a plasmid or viral vector, into which a fragment of the Staphylococcus aureus genome has been inserted.

The present invention further provides host cells containing any of the isolated fragments of the Staphylococcus
aureus genome of the present invention. The host cells can be a higher eukaryotic host cell, such as a mammalian
cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell, such as a bacterial cell.

The present invention is further directed to polypeptides and proteins, preferably isolated polypeptides and
proteins, encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely
may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and
proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using
commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may
be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and
proteins of the present invention from cells which have been altered to express them.

The invention further provides polypeptides, preferably isolated polypeptides, comprising Staphylococcus aureus
epitopes and vaccine compositions comprising such polypeptides. Also provided are methods for vaccinating an
individual against Staphylococcus aureus infection.

The invention further provides methods of obtaining homologs of the fragments of the Staphylococcus aureus
genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention.
Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques
such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind polypeptides and proteins of the present invention.
Such antibodies include both monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an
immortalized cell line which is capable of secreting a specific monoclonal antibody.

The present invention further provides methods of identifying test samples derived from cells which express one
of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one
or more of the antibodies of the present invention, or one or more of the DFs or antigens of the present invention, under
conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry 
out the above-described assays. 

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers,
which comprises: (a) a first container comprising one of the antibodies, antigens, or one of the DFs of the present
invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents
capable of detecting presence of bound antibodies, antigens or hybridized DFs.

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present
invention. Specifically, such agents include, as further described below, antibodies, peptides, carbohydrates,
pharmaceutical agents and the like. Such methods comprise steps of: (a) contacting an agent with an isolated protein
encoded by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.

The present genomic sequences of Staphylococcus aureus will be of great value to all laboratories working with
this organism and for a variety of commercial purposes. Many fragments of the Staphylococcus aureus genome will
be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value
to Staphylococcus aureus researchers and of immediate commercial value for the production of proteins or to control
gene expression.

The methodology and technology for elucidating extensive genomic sequences of bacterial and other genomes
has enhanced, and will greatly enhance, the ability to analyze and understand chromosomal organization. In particular,
sequenced contigs and genomes will provide the models for developing tools for the analysis of chromosome structure
and function, including the ability to identify genes within large segments of genomic DNA, the structure, position, and
spacing of regulatory elements, the identification of genes with potential industrial applications, and the ability to do
comparative genomics and molecular phylogeny.

FIGURE 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems
of the present invention.

FIGURE 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit
and annotate the contigs of the Staphylococcus aureus genome of the present invention. Both Macintosh and Unix
platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al.,
Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 585, IEEE Computer
Society Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for automatic vector sequence
removal and end-trimming of sequence files. The program Loadis runs on a Macintosh platform and parses the feature
data extracted from the sequence files by Factura to the Unix-based Staphylococcus aureus relational database.
Assembly of contigs (and whole genome sequences) is accomplished by retrieving a specific set of sequence files and
their associated features using extrseq, a Unix utility for retrieving sequences from an SQL database. The resulting
sequence file is processed by seq_filter to trim portions of the sequences with more than 2% ambiguous nucleotides.
The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic
Research ("TIGR") for rapid and accurate assembly of thousands of sequence fragments. The collection of contigs
generated by the assembly step is loaded into the database with the lassie program. Identification of open reading
frames (ORFs) is accomplished by processing contigs with zorf. The ORFs are searched against S. aureus sequences
from GenBank and against all protein sequences using the BLASTN and BLASTP programs, described in Altschul et
al., J. Mol. Biol. 215: 403-410 (1990). Results of the ORF determination and similarity searching steps were loaded
into the database. As described below, some results of the determination and the searches are set out in Tables 1-3.
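
The ORF identification step of FIGURE 2 can be illustrated with a minimal sketch. The internals of zorf are not described in this document, so the code below is not that program; it is a generic six-frame scan written under stated assumptions: the start codons ATG, GTG and TTG, the stop codons TAA, TAG and TGA, and a hypothetical minimum-length parameter `min_codons`. All function names are illustrative only.

```python
# Minimal six-frame ORF scan. This is NOT the zorf program named in the text;
# the codon sets and the min_codons cutoff are assumptions for illustration.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed set of bacterial start codons

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, start, end) tuples for start-to-stop ORFs.

    Coordinates are 0-based, half-open, and refer to the strand that was
    scanned (for "-" they index into the reverse complement). The stop
    codon is included in the reported span; min_codons counts the codons
    before the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i  # first start codon after the previous stop
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs
```

For example, `find_orfs(contig, min_codons=100)` would report only candidate ORFs of at least 100 codons; each candidate could then be passed to a similarity search such as BLASTN or BLASTP, as the text describes.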

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome and analysis
of the sequences. The primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID
NOS: 1-5,191. (As used herein, the "primary sequence" refers to the nucleotide sequence represented by the IUPAC
nomenclature system.)
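
As an illustration of what representation in the IUPAC nomenclature system implies for sequence handling, the following sketch checks that a sequence uses only IUPAC nucleotide codes and computes the fraction of ambiguous positions, the quantity compared against the 2% threshold mentioned for seq_filter. The function names are hypothetical, and this is not the seq_filter program itself.

```python
# IUPAC nucleotide codes: four bases (plus U) and the ambiguity codes.
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")
UNAMBIGUOUS = set("ACGTU")


def is_iupac(seq):
    """True if every character is a valid IUPAC nucleotide code."""
    return all(ch in IUPAC_NUCLEOTIDES for ch in seq.upper())


def ambiguous_fraction(seq):
    """Fraction of positions that are not A, C, G, T or U (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(ch not in UNAMBIGUOUS for ch in seq) / len(seq)
```

A read segment could, for instance, be flagged for trimming when `ambiguous_fraction(segment) > 0.02`; how the actual pipeline chooses segment boundaries is not stated in the text.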

In addition to the aforementioned Staphylococcus aureus polynucleotide and polynucleotide sequences, the
present invention provides the nucleotide sequences of SEQ ID NOS: 1-5,191, or representative fragments thereof, in
a form which can be readily used, analyzed, and interpreted by a skilled artisan.

As used herein, a "representative fragment of the nucleotide sequence depicted in SEQ ID NOS: 1-5,191" refers
to any portion of the SEQ ID NOS: 1-5,191 which is not presently represented within a publicly available database.
Preferred representative fragments of the present invention are Staphylococcus aureus open reading frames ("ORFs"),
expression modulating fragments ("EMFs") and fragments which can be used to diagnose the presence of
Staphylococcus aureus in a sample ("DFs"). A non-limiting identification of preferred representative fragments is provided
in Tables 1-3.

As discussed in detail below, the information provided in SEQ ID NOS: 1-5,191 and in Tables 1-3, together with
routine cloning, synthesis, sequencing and assay methods, will enable those skilled in the art to clone and sequence
all "representative fragments" of interest, including open reading frames encoding a large variety of Staphylococcus
aureus proteins.

While the presently disclosed sequences of SEQ ID NOS: 1-5,191 are highly accurate, sequencing techniques are
not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal
a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS: 1-5,191. However, once the
present invention is made available (i.e., once the information in SEQ ID NOS: 1-5,191 and Tables 1-3 has been made
available), resolving a rare sequencing error in SEQ ID NOS: 1-5,191 will be well within the skill of the art. The present
disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to
be obtained readily by straightforward application of routine techniques. Further sequencing of such polynucleotides
may proceed in like manner using manual and automated sequencing methods which are employed ubiquitously in the
art. Nucleotide sequence editing software is publicly available. For example, Applied Biosystems' (AB) AutoAssembler
can be used as an aid during visual inspection of nucleotide sequences. By employing such routine techniques, potential
errors readily may be identified and the correct sequence then may be ascertained by targeting further sequencing
effort, also of a routine nature, to the region containing the potential error.

Even if all of the very rare sequencing errors in SEQ ID NOS: 1-5,191 were corrected, the resulting nucleotide
sequences would still be at least 95% identical, nearly all would be at least 99% identical, and the great majority would
be at least 99.9% identical to the nucleotide sequences of SEQ ID NOS: 1-5,191.

As discussed elsewhere herein, polynucleotides of the present invention readily may be obtained by routine
application of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining
libraries and for sequencing are provided below, for instance. A wide variety of Staphylococcus aureus strains that can
be used to prepare S. aureus genomic DNA for cloning and for obtaining polynucleotides of the present invention are
available to the public from recognized depository institutions, such as the American Type Culture Collection ("ATCC").

The nucleotide sequences of the genomes from different strains of Staphylococcus aureus differ somewhat.
However, the nucleotide sequences of the genomes of all Staphylococcus aureus strains will be at least 95% identical, in
corresponding part, to the nucleotide sequences provided in SEQ ID NOS: 1-5,191. Nearly all will be at least 99%
identical and the great majority will be 99.9% identical.

Thus, the present invention further provides nucleotide sequences which are at least 95%, preferably 99% and
most preferably 99.9%, identical to the nucleotide sequences of SEQ ID NOS: 1-5,191, in a form which can be readily
used, analyzed and interpreted by the skilled artisan.

Methods for determining whether a nucleotide sequence is at least 95%, at least 99% or at least 99.9% identical
to the nucleotide sequences of SEQ ID NOS: 1-5,191 are routine and readily available to the skilled artisan. For example,
the well known fasta algorithm described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444 (1988) can be
used to generate the percent identity of nucleotide sequences. The BLASTN program also can be used to generate
an identity score of polynucleotides compared to one another.
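
To make the identity thresholds concrete, the following minimal sketch computes percent identity over two sequences that have already been aligned (for example, by the fasta algorithm or BLASTN) and reports the highest of the 95%, 99% and 99.9% tiers that the pair reaches. It does not implement either alignment program. The gap character "-", the choice to count gapped columns as mismatches, and the function names are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must be the same length, with '-' marking gaps (an
    assumed convention). Columns that are gaps in both sequences are
    ignored; a column with a gap in one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct):
    """Return the highest threshold (99.9, 99.0 or 95.0) met, else None."""
    for threshold in (99.9, 99.0, 95.0):
        if pct >= threshold:
            return threshold
    return None
```

Other definitions of percent identity exist (for instance, dividing by the length of the shorter sequence), so the denominator used here should be read as one reasonable choice rather than the definition used by any particular program.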

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS: 1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide
sequence of SEQ ID NOS: 1-5,191 may be "provided" in a variety of mediums to facilitate use thereof. As used herein,
"provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence
of the present invention, i.e., a nucleotide sequence provided in SEQ ID NOS: 1-5,191, a representative fragment
thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical
to a polynucleotide of SEQ ID NOS: 1-5,191. Such a manufacture provides a large portion of the Staphylococcus aureus
genome and parts thereof (e.g., a Staphylococcus aureus open reading frame (ORF)) in a form which allows a skilled
artisan to examine the manufacture using means not directly applicable to examining the Staphylococcus aureus
genome or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer
readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed
directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard
disc storage medium and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as
RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable mediums can be used to create a manufacture
comprising computer readable medium having recorded thereon a nucleotide sequence of the present invention. Likewise,
it will be clear to those of skill how additional computer readable media that may be developed also can be used to
create analogous manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled
artisan can readily adopt any of the presently known methods for recording information on computer readable medium
to generate manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium
having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will
generally be based on the means chosen to access the stored information. In addition, a variety of data processor
programs and formats can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file, formatted in
commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored
in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of
data-processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having
recorded thereon the nucleotide sequence information of the present invention.
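
The two storage structures named above, an ASCII text file and a database application, can be sketched as follows. The sketch uses FASTA-style text as an assumed ASCII layout and Python's built-in sqlite3 module as a stand-in for database applications such as DB2, Sybase or Oracle. The table name `contig`, its columns, and the record names in the usage note are hypothetical.

```python
import sqlite3


def to_fasta(records, width=60):
    """Render (name, sequence) pairs as FASTA-style ASCII text."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        # Wrap each sequence at a fixed line width.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def parse_fasta(text):
    """Parse FASTA-style text back into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def store_in_database(records):
    """Load (name, sequence) pairs into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contig (name TEXT PRIMARY KEY, sequence TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO contig VALUES (?, ?)", records)
    conn.commit()
    return conn
```

With records such as `[("contig_1", "ACGT...")]`, the text form suits sequential reading and exchange between programs, while the database form allows retrieval of a single contig by name with a query such as `SELECT sequence FROM contig WHERE name = ?`. This reflects the statement above that the choice of structure depends on how the stored information will be accessed.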

Computer software is publicly available which allows a skilled artisan to access sequence information provided in
a computer readable medium. Thus, by providing in computer readable form the nucleotide sequences of SEQ ID
NOS: 1-5,191, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and
most preferably at least 99.9% identical to a sequence of SEQ ID NOS: 1-5,191, the present invention enables the
skilled artisan routinely to access the provided sequence information for a wide variety of purposes.

The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase
system was used to identify open reading frames (ORFs) within the Staphylococcus aureus genome which contain
homology to ORFs or proteins from both Staphylococcus aureus and from other organisms. Among the ORFs discussed
herein are protein encoding fragments of the Staphylococcus aureus genome useful in producing commercially important
proteins such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based systems, which contain the sequence
information described herein. Such systems are designed to identify, among other things, commercially important
fragments of the Staphylococcus aureus genome.

As used herein, "a computer-based system" refers to the hardware means, software means and data storage
means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means
of the computer-based systems of the present invention comprises a central processing unit (CPU), input means,
output means and data storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention comprise a data storage means having
stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means
for supporting and implementing a search means.

As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the
present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide
sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented on the computer-based
system to compare a target sequence or target structural motif with the sequence information stored within the data
storage means. Search means are used to identify fragments or regions of the present genomic sequences which
match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly and a variety
of commercially available software for conducting search means are and can be used in the computer-based systems
of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and
BLASTX (NCBIA). A skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present computer-based systems.

As used herein, a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two
or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target
sequence will be present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized
that searches for commercially important fragments, such as sequence fragments involved in gene expression and
protein processing, may be of shorter length.
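
The relationship between target length and chance occurrence can be made concrete with a short calculation. The sketch below assumes a simplified model that is not part of the disclosure: residues are independent and equally frequent, and only exact matches on one strand are counted. The function name and the database sizes used are illustrative.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches of one target in a database.

    Assumes independent, equally frequent residues (a simplification that
    real genomes only approximate). alphabet_size is 4 for nucleotides and
    20 for amino acids.
    """
    # Number of positions at which the target could start.
    positions = max(database_length - target_length + 1, 0)
    return positions / (alphabet_size ** target_length)


if __name__ == "__main__":
    # Illustrative database size only; not a figure taken from the text.
    for k in (6, 12, 30):
        print(k, expected_random_hits(k, 2_800_000))
```

Under this model each added nucleotide divides the expected number of chance matches by four, which is why a six-nucleotide target recurs by chance many times in a bacterial-sized database while a 30-nucleotide target effectively never does.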

As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or
combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed
upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include,
but are not limited to, enzymic active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

A variety of structural formats for the input and output means can be used to input and output the information in
the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the
Staphylococcus aureus genomic sequences possessing varying degrees of homology to the target sequence or target
motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the
target sequence or target motif and identifies the degree of homology contained in the identified fragment.
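
A ranked output of the kind described above can be sketched as follows. This is a deliberately naive illustration and not the BLAST or BLAZE algorithm: it slides the target along each fragment without gaps and scores the best fraction of identical positions. The function names and the fragment names in the usage example are hypothetical.

```python
def best_ungapped_identity(target, fragment):
    """Highest fraction of matching positions over all ungapped placements."""
    if not target or len(fragment) < len(target):
        return 0.0
    best = 0
    for offset in range(len(fragment) - len(target) + 1):
        window = fragment[offset:offset + len(target)]
        matches = sum(1 for a, b in zip(target, window) if a == b)
        best = max(best, matches)
    return best / len(target)


def rank_fragments(target, fragments):
    """Return (name, identity) pairs sorted with the highest identity first."""
    scored = [
        (name, best_ungapped_identity(target, seq)) for name, seq in fragments.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    demo = {  # hypothetical fragments
        "frag_a": "TTTACGTACGTTT",
        "frag_b": "TTTACGAACGTTT",
        "frag_c": "GGGGGGGGGGGGG",
    }
    for name, identity in rank_fragments("ACGTACG", demo):
        print(name, round(100 * identity, 1))
```

A production search means would use a statistically grounded score rather than raw identity, but the output structure (fragments ordered by degree of homology, with the degree reported) is the same.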

A variety of comparing means can be used to compare a target sequence or target motif with the data storage
means to identify sequence fragments of the Staphylococcus aureus genome. In the present examples, implementing
software which implements the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215:403-410
(1990), was used to identify open reading frames within the Staphylococcus aureus genome. A skilled artisan can
readily recognize that any one of the publicly available homology search programs can be used as the search means
for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known
to those of skill also may be employed in this regard.

Figure 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present
invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104
are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage
devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage
device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable
storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data
recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes
appropriate software for reading the control logic and/or the data from the removable medium storage device 114, once
it is inserted into the removable medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108,
any of the secondary storage devices 110, and/or a removable storage medium 116. During execution, software for
accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory
108, in accordance with the requirements and operating parameters of the operating system, the hardware system
and the software program or programs.

BIOCHEMICAL EMBODIMENTS

Other embodiments of the present invention are directed to fragments of the Staphylococcus aureus genome,
preferably to isolated fragments. The fragments of the Staphylococcus aureus genome of the present invention include,
but are not limited to, fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which
modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs), and
fragments which can be used to diagnose the presence of Staphylococcus aureus in a sample, hereinafter diagnostic
fragments (DFs).

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the Staphylococcus aureus genome"
refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification
means to reduce, from the composition, the number of compounds which are normally associated with the composition.
Particularly, the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS: 1-5,191, to
representative fragments thereof as described above, and to polynucleotides at least 95%, preferably at least 99% and
especially preferably at least 99.9% identical in sequence thereto, also as set out above.

A variety of purification means can be used to generate the isolated fragments of the present invention. These
include, but are not limited to, methods which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Staphylococcus aureus DNA can be mechanically sheared to produce fragments of 15-20 kb
in length. These fragments can then be used to generate a Staphylococcus aureus library by inserting them into
lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those enumerated
in Tables 1-3, can then be generated using nucleotide sequence information provided in SEQ ID NOS: 1-5,191. Well
known and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA library of
Staphylococcus aureus genomic DNA. Thus, given the availability of SEQ ID NOS: 1-5,191, the information in Tables
1, 2 and 3, and the information that may be obtained readily by analysis of the sequences of SEQ ID NOS: 1-5,191
using methods set out above, those of skill will be enabled by the present disclosure to isolate any ORF-containing or
other nucleic acid fragment of the present invention.

The isolated nucleic acid molecules of the present invention include, but are not limited to, single stranded and
double stranded DNA, and single stranded RNA.

As used herein, an "open reading frame," ORF, means a series of triplets coding for amino acids without any
termination codons and is a sequence translatable into protein.
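
The definition above (a run of triplets uninterrupted by a termination codon) can be sketched directly. The example below is illustrative only and makes several assumptions not stated in the text: it uses the three standard termination codons, scans a single forward reading frame, and does not require a start codon. It is not the GeneMark method used to build the tables, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumed: standard termination codons


def open_stretches(dna, frame=0, min_codons=1):
    """Yield (start, end) spans of stop-free codon runs in one forward frame.

    Coordinates are 0-based and half-open; frame is 0, 1 or 2.
    """
    dna = dna.upper()
    run_start = frame
    pos = frame
    while pos + 3 <= len(dna):
        if dna[pos:pos + 3] in STOP_CODONS:
            # Close the current run just before the termination codon.
            if (pos - run_start) // 3 >= min_codons:
                yield (run_start, pos)
            run_start = pos + 3
        pos += 3
    # Report a run that reaches the end of the sequence without a stop.
    if (pos - run_start) // 3 >= min_codons:
        yield (run_start, pos)


if __name__ == "__main__":
    print(list(open_stretches("ATGAAATAAGGGCCC")))
```

Raising `min_codons` restricts the output to longer stretches, in the same spirit as the minimum length criteria applied to the tables.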

Tables 1, 2 and 3 list ORFs in the Staphylococcus aureus genomic contigs of the present invention that were
identified as putative coding regions by the GeneMark software using organism-specific second-order Markov
probability transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical
methods, such as those discussed herein, to generate more inclusive, more restrictive or more selective lists.

Table 1 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are at least 80 amino
acids long and that are, over a continuous region of at least 50 bases, 95% or more identical (by BLAST analysis) to
an S. aureus nucleotide sequence available through Genbank in November 1996.

Table 2 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are not in Table 1 and
match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through Genbank by
September 1996.

Table 3 sets out ORFs in the Staphylococcus aureus contigs of the present invention that do not match significantly,
by BLASTP analysis, a polypeptide sequence available through Genbank by September 1996.
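
The criteria that separate the three tables can be written as a small decision function. The sketch below is an interpretation rather than the procedure actually used: it assumes that "not matching significantly" in Table 3 means a BLASTP probability above 0.01 or no hit at all, and it applies the 80 amino acid minimum only to Table 1, as the text does. The function and parameter names are hypothetical.

```python
def assign_table(orf_length_aa, best_nt_identity, best_nt_match_length,
                 blastp_probability):
    """Return 1, 2 or 3 for the table an ORF would fall in under these assumptions.

    best_nt_identity     percent identity of the best S. aureus nucleotide hit,
                         or None when there is no hit
    best_nt_match_length length in bases of that continuous matching region
    blastp_probability   BLASTP probability of the best polypeptide hit,
                         or None when there is no hit
    """
    if (orf_length_aa >= 80
            and best_nt_identity is not None
            and best_nt_identity >= 95.0
            and best_nt_match_length >= 50):
        return 1
    if blastp_probability is not None and blastp_probability <= 0.01:
        return 2
    return 3  # assumed: everything else counts as no significant match


if __name__ == "__main__":
    print(assign_table(120, 98.0, 200, 1e-30))
    print(assign_table(120, 90.0, 200, 0.001))
    print(assign_table(60, None, None, None))
```

Testing the Table 1 condition first reflects the statement that Table 2 holds only ORFs not already in Table 1.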

In each table, the first and second columns identify the ORF by, respectively, contig number and ORF number
within the contig; the third column indicates the reading frame, taking the first 5' nucleotide of the contig as the start of
the +1 frame; the fourth column indicates the first nucleotide of the ORF, counting from the 5' end of the contig strand;
and the fifth column indicates the length of each ORF in nucleotides.

In Tables 1 and 2, column six lists the "Reference" for the closest matching sequence available through Genbank.
These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be
familiar with their denominators. Descriptions of the nomenclature are available from the National Center for
Biotechnology Information. Column seven in Tables 1 and 2 provides the "gene name" of the matching sequence; column eight
provides the "BLAST identity" score from the comparison of the ORF and the homologous gene; and column nine
indicates the length in nucleotides of the "highest scoring segment pair" identified by the BLAST identity analysis.

In Table 3, the last column, column six, indicates the length of each ORF in amino acid residues.

The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art.
For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions
1, 3 and 5) are said to have a percent identity of 70%. However, the same two polypeptides would be deemed to have
a percent similarity of 80% if, for example, at position 5, the amino acid moieties, although not identical, were "similar"
(i.e., possessed similar biochemical characteristics). Many programs for analysis of nucleotide or amino acid sequence
similarity, such as FASTA and BLAST, specifically list percent identity of a matching region as an output parameter. Thus,
for instance, Tables 1 and 2 herein enumerate the "percent identity" of the "highest scoring segment pair" in each ORF
and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided
below and are described in the pertinent literature highlighted by the citations provided below.
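
The worked example above can be reproduced in a few lines. The sketch assumes two aligned sequences of equal length with no gaps; the peptide sequences are hypothetical, and treating aspartate (D) and glutamate (E) as "similar" is an illustrative choice, not a rule taken from the text.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned positions carrying identical residues."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def percent_similarity(seq_a, seq_b, similar_pairs):
    """Percent of aligned positions that are identical or listed as similar."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    hits = sum(
        1 for a, b in zip(seq_a, seq_b)
        if a == b or frozenset((a, b)) in similar_pairs
    )
    return 100.0 * hits / len(seq_a)


if __name__ == "__main__":
    first = "MKVLDEAGHT"   # hypothetical 10-residue peptide
    second = "AKGLEEAGHT"  # differs from it at positions 1, 3 and 5
    similar = {frozenset("DE")}  # assumed: the two acidic residues count as similar
    print(percent_identity(first, second))
    print(percent_similarity(first, second, similar))
```

With these inputs seven of ten positions are identical and one further position is similar, matching the 70% and 80% figures in the paragraph above.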

It will be appreciated that other criteria can be used to generate more inclusive and more exclusive listings of the
types set out in the tables. As those of skill will appreciate, narrow and broad searches both are useful. Thus, a skilled
artisan can readily identify ORFs in contigs of the Staphylococcus aureus genome other than those listed in Tables
1-3, such as ORFs which are overlapping or encoded by the opposite strand of an identified ORF, in addition to those
ascertainable using the computer-based systems of the present invention.

As used herein, an "expression modulating fragment," EMF, means a series of nucleotide molecules which
modulates the expression of an operably linked ORF or EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked sequence" when the
expression of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters and
promoter modulating sequences (inducible elements). One class of EMFs are fragments which induce the expression
of an operably linked ORF in response to a specific regulatory factor or physiological event.

EMF sequences can be identified within the contigs of the Staphylococcus aureus genome by their proximity to
the ORFs provided in Tables 1-3. An intergenic segment, or a fragment of the intergenic segment, from about 10 to
200 nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably
linked ORF in a fashion similar to that found with the naturally linked ORF sequence. As used herein, an "intergenic
segment" refers to fragments of the Staphylococcus aureus genome which are between two ORF(s) herein described.
EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems
of the present invention. Further, the two methods can be combined and used together.
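
The proximity-based approach can be sketched as a search for gaps between annotated ORFs. The example below is a simplified reading of the paragraph above: it keeps only whole gaps whose length falls within the stated range of about 10 to 200 nucleotides, ignores strand, and does not extract shorter fragments from longer gaps. The function name and coordinate convention are hypothetical.

```python
def intergenic_segments(contig, orf_spans, min_len=10, max_len=200):
    """Return (start, end, sequence) for gaps between neighbouring ORFs.

    orf_spans holds (start, end) pairs, 0-based and half-open, on one contig.
    Overlapping or nested ORFs leave no gap and are skipped.
    """
    if not orf_spans:
        return []
    ordered = sorted(orf_spans)
    segments = []
    furthest_end = ordered[0][1]
    for start, end in ordered[1:]:
        gap = start - furthest_end
        if min_len <= gap <= max_len:
            segments.append((furthest_end, start, contig[furthest_end:start]))
        # Track the right-most ORF end seen so far to handle nested ORFs.
        furthest_end = max(furthest_end, end)
    return segments


if __name__ == "__main__":
    contig = "A" * 30 + "C" * 15 + "G" * 30 + "T" * 5 + "A" * 30
    print(intergenic_segments(contig, [(0, 30), (45, 75), (80, 110)]))
```

Candidate segments found this way would still need to be confirmed experimentally, for example with the EMF trap vector described next.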

The presence and activity of an EMF can be confirmed using an EMF trap vector. An EMF trap vector contains a
cloning site linked to a marker sequence. A marker sequence encodes an identifiable phenotype, such as antibiotic
resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap
vector is placed within an appropriate host under appropriate conditions. As described above, an EMF will modulate the
expression of an operably linked marker sequence. A more detailed discussion of various marker sequences is provided
below.

A sequence which is suspected as being an EMF is cloned in all three reading frames in one or more restriction
sites upstream from the marker sequence in the EMF trap vector. The vector is then transformed into an appropriate
host using known procedures and the phenotype of the transformed host is examined under appropriate conditions.
As described above, an EMF will modulate the expression of an operably linked marker sequence.

As used herein, a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize
to Staphylococcus aureus sequences. DFs can be readily identified by identifying unique sequences within contigs of
the Staphylococcus aureus genome, such as by using well-known computer analysis software, and by generating and
testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which
determines amplification or hybridization selectivity.

The sequences falling within the scope of the present invention are not limited to the specific sequences herein
described but also include allelic and species variations thereof. Allelic and species variations can be routinely
determined by comparing the sequences provided in SEQ ID NOS: 1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably 99% and most preferably 99.9% identical to SEQ ID NOS: 1-5,191, with a sequence
from another isolate of the same species.

Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same
amino acid sequences as do the nucleic acid sequences mentioned above. In other words, in the coding region of an
ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated.
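
Whether two coding sequences differ only by such synonymous substitutions can be checked by translating both and comparing the results. The sketch below assumes the standard genetic code and input restricted to the letters A, C, G and T; any other character raises a `KeyError`. The function names and demonstration sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna):
    """Translate a coding sequence codon by codon, ignoring a trailing partial codon."""
    coding_dna = coding_dna.upper()
    usable = len(coding_dna) - len(coding_dna) % 3
    return "".join(CODON_TABLE[coding_dna[i:i + 3]] for i in range(0, usable, 3))


def encodes_same_polypeptide(dna_a, dna_b):
    """True when both coding sequences translate to the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


if __name__ == "__main__":
    # CTT and TTG both encode leucine; AAA and AAG both encode lysine.
    print(encodes_same_polypeptide("ATGCTTAAA", "ATGTTGAAG"))
```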

Any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment,
such as an ORF, in both directions (i.e., sequence both strands). Alternatively, error screening can be performed by
sequencing corresponding polynucleotides of Staphylococcus aureus origin isolated by using part or all of the fragments
in question as a probe or primer.

Each of the ORFs of the Staphylococcus aureus genome disclosed in Tables 1, 2 and 3, and the EMFs found 5'
to the ORFs, can be used as polynucleotide reagents in numerous ways. For example, the sequences can be used
as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample,
particularly Staphylococcus aureus. Especially preferred in this regard are ORFs such as those of Table 3, which do not
match previously characterized sequences from other organisms and thus are most likely to be highly selective for
Staphylococcus aureus. Also particularly preferred are ORFs that can be used to distinguish between strains of
Staphylococcus aureus, particularly those that distinguish medically important strains, such as drug-resistant strains.

In addition, the fragments of the present invention, as broadly described, can be used to control gene expression
through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a
polynucleotide sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of RNA transcription from DNA,
while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Information from the
sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides.
Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary
to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition.
Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known
and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., Nucl. Acids Res. 6:
3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991). Antisense
techniques in general are discussed in, for instance, Okano, J. Neurochem. 56:560 (1991) and
OLIGODEOXYNUCLEOTIDES AS ANTISENSE INHIBITORS OF GENE EXPRESSION, CRC Press, Boca Raton, FL (1986).
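
The complementarity requirement for an antisense oligonucleotide can be illustrated with a reverse-complement calculation. The sketch below covers only that single step and assumes a DNA oligonucleotide directed against an mRNA region; it enforces the 20 to 40 base range mentioned above but makes no attempt to model specificity, secondary structure or delivery. The function name and demonstration sequence are hypothetical.

```python
# Complement map producing DNA bases from either DNA or RNA input.
COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_dna_oligo(mrna, start, length=20):
    """Return a DNA oligo, written 5' to 3', complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    region = mrna[start:start + length]
    if len(region) != length:
        raise ValueError("requested region runs past the end of the sequence")
    # Complement each base, then reverse so the oligo reads 5' to 3'.
    return region.translate(COMPLEMENT)[::-1].upper()


if __name__ == "__main__":
    demo_mrna = "AUGGCUAAAGGUCCUGAAUUCGGAUCC"  # hypothetical transcript
    print(antisense_dna_oligo(demo_mrna, 0, 20))
```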

The present invention further provides recombinant constructs comprising one or more fragments of the
Staphylococcus aureus genomic fragments and contigs of the present invention. Certain preferred recombinant constructs of
the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the Staphylococcus
aureus genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the
ORFs of the present invention, the vector may further comprise regulatory sequences, including, for example, a
promoter operably linked to the ORF. For vectors comprising the EMFs of the present invention, the vector may further
comprise a marker sequence or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially
available for generating the recombinant constructs of the present invention. The following vectors are provided by
way of example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK and KS (+ and -), pNH8a,
pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available
from Pharmacia). Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene);
pSVK3, pBPV, pMSG, pSVL (available from Pharmacia).

Promoter regions can be selected from any desired gene using CAT (chloramphenicol transferase) vectors or other
vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial
promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate
vector and promoter is well within the level of ordinary skill in the art.

The present invention further provides host cells containing any one of the isolated fragments of the Staphylococcus
aureus genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the
host cell using known methods. The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or a procaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present
invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such
as calcium phosphate transfection, DEAE dextran mediated transfection and electroporation, which are described in,
for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).

A host cell containing one of the fragments of the Staphylococcus aureus genomic fragments and contigs of the
present invention can be used in conventional manners to produce the gene product encoded by the isolated fragment
(in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.

The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present
invention or by degenerate variants of the nucleic acid fragments of the present invention. By "degenerate variant" is
intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by
nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence.

Preferred nucleic acid fragments of the present invention are the ORFs depicted in Tables 2 and 3 which encode
proteins.

A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins
of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially
available peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides.
Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies
against the native polypeptide, as discussed further below.

In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the
polypeptide or protein. One skilled in the art can readily employ well-known methods for isolating polypeptides and
proteins to isolate and purify polypeptides or proteins of the present invention produced naturally by a bacterial strain,
or by other methods. Methods for isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and
immuno-affinity chromatography.

The polypeptides and proteins of the present invention also can be purified from cells which have been altered to
express the desired polypeptide or protein. As used herein, a cell is said to be altered to express a desired polypeptide
or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally
does not produce or which the cell normally produces at a lower level. Those skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells
in order to generate a cell which produces one of the polypeptides or proteins of the present invention.

Any host/vector system can be used to express one or more of the ORFs of the present invention. These include,
but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells and Sf9 cells, as well as prokaryotic
hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular
polypeptide or protein or which express the polypeptide or protein at a low natural level.

"Recombinant," as used herein, means that a polypeptide or protein is derived from recombinant (e.g., microbial
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins made in bacterial or
fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" defines a polypeptide or protein
essentially free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or
proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells.

"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides. Generally, DNA segments encoding the
polypeptides and proteins provided by this invention are assembled from fragments of the Staphylococcus aureus
genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is
capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a
microbial or viral operon.

"Recombinant expression vehicle or vector" refers to a plasmid or phage or virus or vector, for expressing a
polypeptide from a DNA (RNA) sequence. The expression vehicle can comprise a transcriptional unit comprising an assembly
of (1) genetic regulatory elements necessary for gene expression in the host, including elements required to initiate
and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including, for example,
promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a structural or coding sequence
which is transcribed into mRNA and translated into protein; and (3) appropriate signals to initiate translation at the
beginning of the desired coding region and terminate translation at its end. Structural units intended for use in yeast
or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated
protein by a host cell. Alternatively, where recombinant protein is expressed without a leader or transport sequence,
it may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the
expressed recombinant protein to provide a final product.

"Recombinant expression system" means host cells which have stably integrated a recombinant transcriptional
unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. The cells can be
prokaryotic or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or
proteins upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of
appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived
from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic
and eukaryotic hosts are described in Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd
Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which is hereby
incorporated by reference in its entirety.

Generally, recombinant expression vectors will include origins of replication and selectable markers permitting
transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and the S. cerevisiae TRP1 gene, and a
promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK),
alpha-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled
in appropriate phase with translation initiation and termination sequences, and preferably a leader sequence capable
of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the
heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a
desired protein together with suitable translation initiation and termination signals in operable reading phase with a
functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication
to ensure maintenance of the vector and, when desirable, provide amplification within the host.

Suitable prokaryotic hosts for transformation include strains of Staphylococcus aureus, E. coli, B. subtilis,
Salmonella typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus. Others
may also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable
marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements
of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(available from Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (available from Promega Biotec, Madison,
WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence
to be expressed.

Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the
selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or
chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product.
Thereafter, cells are typically harvested, generally by centrifugation, and disrupted to release expressed protein,
generally by physical or chemical means, and the resulting crude extract is retained for further purification.

Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of
mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175
(1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines.

Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also
any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for
example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as
necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can
be employed for final purification steps.

An additional aspect of the invention includes Staphylococcus aureus polypeptides which are useful as
immunodiagnostic antigens and/or immunoprotective vaccines, collectively "immunologically useful polypeptides". Such
immunologically useful polypeptides may be selected from the ORFs disclosed herein based on techniques well known
in the art and described elsewhere herein. The inventors have used the following criteria to select several
immunologically useful polypeptides.

As is known in the art, an amino terminal type I signal sequence directs a nascent protein across the plasma and
outer membranes to the exterior of the bacterial cell. Such outermembrane polypeptides are expected to be
immunologically useful. According to Izard, J. W. et al., Mol. Microbiol. 13, 765-773 (1994), polypeptides containing type I
signal sequences have the following physical attributes: the length of the type I signal sequence is approximately
15 to 25 primarily hydrophobic amino acid residues with a net positive charge in the extreme amino terminus; the
central region of the signal sequence must adopt an alpha-helical conformation in a hydrophobic environment; and the
region surrounding the actual site of cleavage is ideally six residues long, with small side-chain amino acids in the -1
and -3 positions.
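
The type I attributes listed above lend themselves to a rough computational screen. The following Python sketch is
illustrative only and is not the inventors' algorithm: the function name, the residue sets, the five-residue window
used to measure the amino-terminal charge, and the 50% hydrophobicity cutoff are all assumptions, and the
alpha-helical requirement for the central region is not tested.

```python
# Rough screen for the type I signal sequence attributes described above.
# All names, residue sets and cutoffs here are illustrative assumptions.
HYDROPHOBIC = set("AVLIMFWC")   # assumed set of hydrophobic residues
SMALL = set("AGS")              # assumed set of small side-chain residues


def looks_like_type1_signal(peptide: str, n_region: int = 5) -> bool:
    """Return True if a candidate signal peptide (ending at position -1)
    matches the length, charge, hydrophobicity and -1/-3 criteria."""
    peptide = peptide.upper()
    # Approximately 15 to 25 residues long.
    if not 15 <= len(peptide) <= 25:
        return False
    # Net positive charge at the extreme amino terminus.
    head = peptide[:n_region]
    net_charge = sum(aa in "KR" for aa in head) - sum(aa in "DE" for aa in head)
    if net_charge <= 0:
        return False
    # Primarily hydrophobic residues overall.
    hydrophobic_fraction = sum(aa in HYDROPHOBIC for aa in peptide) / len(peptide)
    if hydrophobic_fraction <= 0.5:
        return False
    # Small side chains at the -1 and -3 positions.
    return peptide[-1] in SMALL and peptide[-3] in SMALL
```

A caller would pass the residues up to a proposed cleavage point, for example
looks_like_type1_signal("MKKLLLAALVLAALLAASA"), and treat a True result only as a reason to examine the ORF further.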

Also known in the art is the type IV signal sequence, which is an example of the several types of functional signal
sequences which exist in addition to the type I signal sequence detailed above. Although functionally related, the type
IV signal sequence possesses a unique set of biochemical and physical attributes (Strom, M. S. and Lory, S.,
J. Bacteriol. 174, 7345-7351 (1992)). These are typically six to eight amino acids with a net basic charge followed by an
additional sixteen to thirty primarily hydrophobic residues. The cleavage site of a type IV signal sequence is typically
after the initial six to eight amino acids at the extreme amino terminus. In addition, all type IV signal sequences contain
a phenylalanine residue at the +1 site relative to the cleavage site.
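
A comparable screen can be sketched for the type IV attributes. The Python below is a hypothetical illustration, not
the inventors' method: the function name, the hydrophobic residue set, the choice to test only the first sixteen
residues after the leader, and the 50% cutoff are assumptions.

```python
# Rough screen for the type IV signal sequence attributes described above.
# Names, residue sets and cutoffs are illustrative assumptions.
HYDROPHOBIC = set("AVLIMFWC")


def find_type4_leader_length(protein: str, min_hydrophobic: int = 16):
    """Return the leader length (6 to 8) if the N-terminus resembles a
    type IV signal sequence, otherwise None."""
    protein = protein.upper()
    for leader_len in range(6, 9):
        leader = protein[:leader_len]
        tail = protein[leader_len:leader_len + min_hydrophobic]
        if len(tail) < min_hydrophobic:
            continue
        # The leader must carry a net basic charge.
        net_charge = sum(aa in "KR" for aa in leader) - sum(aa in "DE" for aa in leader)
        if net_charge <= 0:
            continue
        # Phenylalanine at the +1 site relative to the cleavage site.
        if tail[0] != "F":
            continue
        # The following residues must be primarily hydrophobic.
        if sum(aa in HYDROPHOBIC for aa in tail) / len(tail) > 0.5:
            return leader_len
    return None
```

The returned value is the number of residues preceding the putative cleavage site, so protein[leader_len:] would be
the candidate processed form.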

Studies of the cleavage sites of twenty-six bacterial lipoprotein precursors have allowed the definition of a consensus
amino acid sequence for lipoprotein cleavage. Nearly three-fourths of the bacterial lipoprotein precursors examined
contained the sequence L-(A,S)-(G,A)-C at positions -3 to +1, relative to the point of cleavage (Hayashi, S. and Wu,
H. C., Lipoproteins in bacteria, J. Bioenerg. Biomembr. 22, 451-471 (1990)).
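
The consensus L-(A,S)-(G,A)-C maps directly onto a regular expression. The short Python sketch below is offered as an
illustration under stated assumptions: the function name is hypothetical, only the first match is reported, and no
check is made that the match lies near the amino terminus.

```python
import re

# L-(A,S)-(G,A)-C at positions -3 to +1; cleavage falls just before the Cys.
LIPOPROTEIN_CONSENSUS = re.compile(r"L[AS][GA]C")


def lipoprotein_cleavage_index(precursor: str):
    """Return the 0-based index of the +1 cysteine of the first consensus
    match, or None if the consensus is absent."""
    match = LIPOPROTEIN_CONSENSUS.search(precursor.upper())
    if match is None:
        return None
    return match.end() - 1
```

Slicing the precursor from the returned index, precursor[index:], gives the predicted mature sequence beginning with
the cysteine.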

It is well known that most anchored proteins found on the surface of gram-positive bacteria possess a highly
conserved carboxy terminal sequence. More than fifty such proteins from organisms such as S. pyogenes, S. mutans,
E. faecalis, S. pneumoniae, and others have been identified based on their extracellular location and carboxy terminal
amino acid sequence (Fischetti, V. A., Gram-positive commensal bacteria deliver antigens to elicit mucosal and systemic
immunity, ASM News 62, 405-410 (1996)). The conserved region is comprised of six charged amino acids at the extreme
carboxy terminus coupled to 15-20 hydrophobic amino acids presumed to function as a transmembrane domain.
Immediately adjacent to the transmembrane domain is a six amino acid sequence conserved in nearly all proteins
examined. The amino acid sequence of this region is L-P-X-T-G-X, where X is any amino acid.
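
Locating the L-P-X-T-G-X sequence is likewise a simple pattern search. The Python sketch below is a minimal
illustration with a hypothetical function name; it reports every occurrence and does not verify the adjacent
hydrophobic domain or the charged carboxy terminus, which a fuller screen would also need to test.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
ANCHOR_MOTIF = re.compile(r"(?=(LP.TG.))")


def find_lpxtgx(protein: str):
    """Return (1-based start position, matched six residues) for every
    L-P-X-T-G-X occurrence in the protein."""
    return [(m.start() + 1, m.group(1)) for m in ANCHOR_MOTIF.finditer(protein.upper())]
```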

Amino acid sequence similarities to proteins of known function by BLAST enable the assignment of putative
functions to novel amino acid sequences and allow for the selection of proteins thought to function outside the cell
wall. Such proteins are well known in the art and include "lipoprotein", "periplasmic" or "antigen".

An algorithm for selecting antigenic and immunogenic Staphylococcus aureus polypeptides including the foregoing
criteria was developed by the present inventors. Use of the algorithm by the inventors to select immunologically useful
Staphylococcus aureus polypeptides resulted in the selection of several ORFs which are predicted to be
outermembrane-associated proteins. These proteins are identified in Table 4, below, and shown in the Sequence Listing as SEQ
ID NOS: 5,192 to 5,255. Thus the amino acid sequence of each of several antigenic Staphylococcus aureus polypeptides
listed in Table 4 can be determined, for example, by locating the amino acid sequence of the ORF in the Sequence
Listing. Likewise, the polynucleotide sequence encoding each ORF can be found by locating the corresponding
polynucleotide SEQ ID in Tables 1, 2, or 3, and finding the corresponding nucleotide sequence in the Sequence Listing.

As will be appreciated by those of ordinary skill in the art, although a polypeptide representing an entire ORF may
be the closest approximation to a protein found in vivo, it is not always technically practical to express a complete ORF
in vitro. It may be very challenging to express and purify a highly hydrophobic protein by common laboratory methods.
As a result, the immunologically useful polypeptides described herein as SEQ ID NOS: 5,192-5,255 may have been
modified slightly to simplify the production of recombinant protein, and are the preferred embodiments. In general,
nucleotide sequences which encode highly hydrophobic domains, such as those found at the amino terminal signal
sequence, are excluded for enhanced in vitro expression of the polypeptides. Furthermore, any highly hydrophobic
amino acid sequences occurring at the carboxy terminus are also excluded. Such truncated polypeptides include, for
example, the mature forms of the polypeptides expected to exist in nature.

Those of ordinary skill in the art can identify soluble portions of the polypeptides identified in Table 4, and in the
case of truncated polypeptide sequences shown as SEQ ID NOS: 5,192-5,255, may obtain the complete predicted amino
acid sequence of each polypeptide by translating the corresponding polynucleotide sequences of the corresponding
ORF listed in Tables 1, 2 and 3 and found in the Sequence Listing.
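
Translating an ORF's polynucleotide sequence into its predicted amino acid sequence can be sketched in a few lines of
Python. This is an illustrative example rather than a procedure taken from the specification: the function name is
hypothetical, the codon assignments are those of the standard genetic code, unrecognized codons are reported as "X",
and alternative initiation codons are translated literally rather than as methionine.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_orf(dna: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")  # "X" for ambiguous codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

For instance, translate_orf("ATGAAAGGTTAA") reads the codons ATG, AAA, GGT and the stop codon TAA.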

Accordingly, polypeptides comprising the complete amino acid sequence of an immunologically useful polypeptide selected
from the group of polypeptides encoded by the ORFs identified in Table 4, or an amino acid sequence at least 95%
identical thereto, preferably at least 97% identical thereto, and most preferably at least 99% identical thereto, form an
embodiment of the invention; in addition, polypeptides comprising an amino acid sequence selected from the group of
amino acid sequences shown in the Sequence Listing as SEQ ID NOS: 5,191-5,255, or an amino acid sequence at least
95% identical thereto, preferably at least 97% identical thereto, and most preferably at least 99% identical thereto, form
an embodiment of the invention. Polynucleotides encoding the foregoing polypeptides also form part of the present
invention.

In another aspect, the invention provides a peptide or polypeptide comprising an epitope-bearing portion of a
polypeptide of the invention, particularly those epitope-bearing portions (antigenic regions) identified in Table 4. The
epitope-bearing portion is an immunogenic or antigenic epitope of a polypeptide of the invention. An "immunogenic
epitope" is defined as a part of a protein that elicits an antibody response when the whole protein is the immunogen.
On the other hand, a region of a protein molecule to which an antibody can bind is defined as an "antigenic epitope."
The number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, for
instance, Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1983).

As to the selection of peptides or polypeptides bearing an antigenic epitope (i.e., that contain a region of a protein
molecule to which an antibody can bind), it is well known in the art that relatively short synthetic peptides that mimic
part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein.
See, for instance, Sutcliffe, J. G., Shinnick, T. M., Green, N. and Lerner, R. A. (1983) "Antibodies that react with
predetermined sites on proteins", Science 219:660-666. Peptides capable of eliciting protein-reactive sera are
frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and
are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or
carboxyl terminals. Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise
antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention. See, for instance,
Wilson et al., Cell 37:767-778 (1984) at 777.

Antigenic epitope-bearing peptides and polypeptides of the invention preferably contain a sequence of at least
seven, more preferably at least nine and most preferably between about 15 to about 30 amino acids contained within
the amino acid sequence of a polypeptide of the invention. Non-limiting examples of antigenic polypeptides or peptides
that can be used to generate S. aureus specific antibodies include a polypeptide comprising peptides shown in Table
4 below. These polypeptide fragments have been determined to bear antigenic epitopes of indicated S. aureus proteins
by the analysis of the Jameson-Wolf antigenic index, a representative sample of which is shown in Figure 3.

The epitope-bearing peptides and polypeptides of the invention may be produced by any conventional means.
See, e.g., Houghten, R. A. (1985) General method for the rapid solid-phase synthesis of large numbers of peptides:
specificity of antigen-antibody interaction at the level of individual amino acids. Proc. Natl. Acad. Sci. USA
82:5131-5135. This "Simultaneous Multiple Peptide Synthesis (SMPS)" process is further described in U.S. Patent No.
4,631,211 to Houghten et al. (1986). Epitope-bearing peptides and polypeptides of the invention are used to induce
antibodies according to methods well known in the art. See, for instance, Sutcliffe et al., supra; Wilson et al., supra;
Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985).

Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response
when the whole protein is the immunogen, are identified according to methods known in the art. See, for instance,
Geysen et al., supra. Further still, U.S. Patent No. 5,194,392 to Geysen (1990) describes a general method of detecting
or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the
epitope (i.e., a "mimotope") which is complementary to a particular paratope (antigen binding site) of an antibody of
interest. More generally, U.S. Patent No. 4,433,092 to Geysen (1989) describes a method of detecting or determining
a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding
site of a particular receptor of interest. Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A. et al. (1996) on
Peralkylated Oligopeptide Mixtures discloses linear C1-C7-alkyl peralkylated oligopeptides and sets and libraries of such
peptides, as well as methods for using such oligopeptide sets and libraries for determining the sequence of a
peralkylated oligopeptide that preferentially binds to an acceptor molecule of interest. Thus, non-peptide analogs of the
epitope-bearing peptides of the invention also can be made routinely by these methods.

Table 4 lists immunologically useful polypeptides identified by an algorithm which locates novel Staphylococcus
aureus outermembrane proteins, as is described above. Also listed are epitopes or "antigenic regions" of each of the
identified polypeptides. The antigenic regions, or epitopes, are delineated by two numbers x-y, where x is the number
of the first amino acid in the open reading frame included within the epitope and y is the number of the last amino acid
in the open reading frame included within the epitope. For example, the first epitope in ORF 168-6 is comprised of
amino acids 36 to 45 of SEQ ID NO: 5,192, as is described in Table 4. The inventors have identified several epitopes
for each of the antigenic polypeptides identified in Table 4. Accordingly, forming part of the present invention are
polypeptides comprising an amino acid sequence of one or more antigenic regions identified in Table 4. The invention
further provides polynucleotides encoding such polypeptides.
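
Because the x-y coordinates are one-based and inclusive of both ends, extracting an antigenic region from an ORF's
amino acid sequence requires a small index adjustment in most programming languages. The Python sketch below
illustrates this; the function name and the placeholder sequence in the usage note are hypothetical and are not taken
from the Sequence Listing.

```python
def epitope_region(orf_protein: str, region: str) -> str:
    """Return the residues named by an "x-y" antigenic region, where x and y
    are 1-based positions of the first and last amino acid, inclusive."""
    first, last = (int(part) for part in region.split("-"))
    if not 1 <= first <= last <= len(orf_protein):
        raise ValueError(f"region {region} lies outside the ORF (length {len(orf_protein)})")
    # Convert the 1-based inclusive range to Python's 0-based half-open slice.
    return orf_protein[first - 1:last]
```

With a placeholder sequence such as "ACDEFGHIKLMNPQRSTVWY" repeated three times, epitope_region(sequence, "36-45")
returns the ten residues at positions 36 through 45.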

The present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are
substantially equivalent to those herein described. As used herein, substantially equivalent can refer both to nucleic acid
and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity
between reference and subject sequences. For purposes of the present invention, sequences having equivalent biological
activity and equivalent expression characteristics are considered substantially equivalent. For purposes of determining
equivalence, truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining homologs from other strains of Staphylococcus aureus, of the
fragments of the Staphylococcus aureus genome of the present invention and homologs of the proteins encoded by
the ORFs of the present invention. As used herein, a sequence or protein of Staphylococcus aureus is defined as a
homolog of a fragment of the Staphylococcus aureus fragments or contigs or a protein encoded by one of the ORFs
of the present invention, if it shares significant homology to one of the fragments of the Staphylococcus aureus genome
of the present invention or a protein encoded by one of the ORFs of the present invention. Specifically, by using the
sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque
hybridization, one skilled in the art can obtain homologs.

As used herein, two nucleic acid molecules or proteins are said to "share significant homology" if the two contain
regions which possess greater than 85% sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those with 93% or more homology. Among
especially preferred homologs, those with 95% or more homology are particularly preferred. Very particularly preferred
among these are those with 97%, and even more particularly preferred among those are homologs with 99% or more
homology. The most preferred homologs among these are those with 99.9% homology or more. It will be understood
that, among measures of homology, identity is particularly preferred in this regard.
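
The graded thresholds above can be applied mechanically once a percent identity has been computed. The Python sketch
below is illustrative only: it assumes the two regions have already been aligned to equal length by some external
method, the function names are hypothetical, and the 97% level is treated as "97% or more", which the text does not
state explicitly.

```python
# (threshold, inclusive) pairs from the text, highest first.
# "greater than 85%" and "more than 90%" are exclusive; the rest are "or more".
HOMOLOGY_LEVELS = [
    (99.9, True), (99.0, True), (97.0, True), (95.0, True),
    (93.0, True), (90.0, False), (85.0, False),
]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length regions."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("regions must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(aligned_a.upper(), aligned_b.upper()))
    return 100.0 * matches / len(aligned_a)


def highest_homology_level(identity: float):
    """Return the highest threshold the identity satisfies, or None if the
    regions do not share significant homology."""
    for threshold, inclusive in HOMOLOGY_LEVELS:
        if identity > threshold or (inclusive and identity == threshold):
            return threshold
    return None
```

A result of None means the pair falls at or below the 85% floor; any other value names the most stringent preference
level that the pair meets.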

Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS: 1-5,191 or from
a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS: 1-5,191 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing
cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have
been described in great detail in many publications such as, for example, Innis et al., PCR PROTOCOLS, Academic
Press, San Diego, CA (1990).

When using primers derived from SEQ ID NOS: 1-5,191 or from a nucleotide sequence having an aforementioned
identity to a sequence of SEQ ID NOS: 1-5,191, one skilled in the art will recognize that by employing high stringency
conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only
sequences which are greater than 75% homologous to the primer will be amplified. By employing lower stringency
conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC),
sequences which are greater than 40-50% homologous to the primer will also be amplified.

When using DNA probes derived from SEQ ID NOS: 1-5,191, or from a nucleotide sequence having an
aforementioned identity to a sequence of SEQ ID NOS: 1-5,191, for colony/plaque hybridization, one skilled in the art will
recognize that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and
washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe
can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and
40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45%
homologous to the probe will be obtained.

Any organism can be used as the source for homologs of the present invention so long as the organism naturally
expresses such a protein or contains genes encoding the same. The most preferred organisms for isolating homologs
are bacteria which are closely related to Staphylococcus aureus.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION 

Each ORF provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide.
As a result, one skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and
industrial purposes consistent with the type of putative identification of the polypeptide. Such identifications permit one
skilled in the art to use the Staphylococcus aureus ORFs in a manner similar to the known type of sequences for which
the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite. A
variety of reviews illustrative of this aspect of the invention are available, including the following reviews on the industrial
use of enzymes, for example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd Ed.,
Macmillan Publications, Ltd., NY (1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds., Elsevier
Science Publishers, Amsterdam, The Netherlands (1985). A variety of exemplary uses that illustrate this and similar
aspects of the present invention are discussed below.

1. Biosynthetic Enzymes 

Open reading frames encoding proteins involved in mediating the catalytic reactions involved in intermediary and
macromolecular metabolism, the biosynthesis of small molecules, cellular processes and other functions can be used
for industrial biosynthesis. These include enzymes involved in the degradation of the intermediary products of
metabolism, enzymes involved in central intermediary metabolism, enzymes involved in respiration, both aerobic and
anaerobic, enzymes involved in fermentation, enzymes involved in ATP proton motive force conversion, enzymes
involved in broad regulatory function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide
synthesis, and enzymes involved in cofactor and vitamin synthesis.

The various metabolic pathways present in Staphylococcus aureus can be identified based on absolute nutritional
requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS: 1-5,191.

Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as
non-macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase.

Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a
number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification
and depectinization of fruit juices, in the extraction of vegetable oil and in the maceration of fruits and vegetables to
give unicellular fruits. A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts
et al., Symbiosis 21:79 (1986) and Voragen et al. in BIOCATALYSTS IN AGRICULTURAL BIOTECHNOLOGY,
Whitaker et al., Eds., American Chemical Society Symposium Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism of Staphylococcus aureus. Enzymes
involved in the degradation of sugars, particularly glucose, galactose, fructose and xylose, can be used in
industrial fermentation. Some of the important sugar transforming enzymes, from a commercial viewpoint, include
sugar isomerases such as glucose isomerase. Other metabolic enzymes have found commercial use, such as glucose
oxidases, which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial production of ascorbic
acid using Reichstein's procedure, as described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag
Press, Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in purified form as well as in an immobilized
form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters 1:21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid. Gluconic acids are marketed for use
in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as described,
for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI, Benett et al., Eds.,
Academic Press, New York (1985). In addition to industrial applications, GOD has found applications in medicine for
quantitative determination of glucose in body fluids and, recently, in biotechnology for analyzing syrups from starch and
cellulose hydrolysates. This application is described in Owusu et al., Biochem. et Biophysica Acta 872:83 (1986), for
instance.

The main sweetener used in the world today is sugar which comes from sugar beets and sugar cane. In the field
of industrial enzymes, the glucose isomerase process shows the largest expansion in the market today. Initially, soluble
enzymes were used and later immobilized enzymes were developed (Krueger et al., Biotechnology, The Textbook of
Industrial Microbiology, Sinauer Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business using immobilized enzymes. A review of
the industrial use of these enzymes is provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the
largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a
large body of published and unpublished information regarding the use of these enzymes in industrial processes (See
Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and
Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report Industrial
Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are the microbial lipases described by, for
instance, Macrae et al., Philosophical Transactions of the Royal Society of London 310:227 (1985) and Poserke,
Journal of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in the fat and oil industry for the
production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides. Applications
of lipases include the use as a detergent additive to facilitate the removal of fats from fabrics in the course of the
washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the synthesis of complex
organic molecules is gaining popularity at a great rate. One area of great interest is the preparation of chiral
intermediates. Preparation of chiral intermediates is of interest to a wide range of synthetic chemists, particularly those scientists
involved with the preparation of new pharmaceuticals, agrochemicals, fragrances and flavors (See Davies et al.,
Recent Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton, Florida (1990)).
The following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters,
phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides,
reduction of alkanones and oxoalkanates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides,
and carbon bond forming reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the present invention for biotransformation
and organic synthesis, it is sometimes necessary to consider the respective advantages and disadvantages of using a
microorganism as opposed to an isolated enzyme. The pros and cons of using a whole cell system on the one hand or an
isolated partially purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in Britain
(1987), p. 127.

Amino transferases, enzymes involved in the biosynthesis and metabolism of amino acids, are useful in the catalytic
production of amino acids. The advantages of using microbial based enzyme systems are that the amino transferase
enzymes catalyze the stereo-selective synthesis of only L-amino acids and generally possess uniformly high catalytic
rates. A description of the use of amino transferases for amino acid production is provided by Roselle-David, Methods
of Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in
nucleic acid synthesis, repair, and recombination. A variety of commercially important enzymes have previously been
isolated from members of Staphylococcus aureus. These include Sau3A and Sau96I.

2. Generation of Antibodies

As described here, the proteins of the present invention, as well as homologs thereof, can be used in a variety of
procedures and methods known in the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the protein. Such antibodies can be
either monoclonal or polyclonal antibodies, as well as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of the proteins of the present invention and
hybridomas which produce these antibodies. A hybridoma is an immortalized cell line which is capable of secreting a
specific monoclonal antibody.

In general, techniques for preparing polyclonal and monoclonal antibodies as well as hybridomas capable of
producing the desired antibody are well known in the art (Campbell, A. M., MONOCLONAL ANTIBODY TECHNOLOGY:
LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21 (1980); Kohler and Milstein, Nature
256:495-497 (1975)), the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today
4:72 (1983); pgs. 77-96 of Cole et al., in MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc.
(1985)).

Any animal (mouse, rabbit, etc.) which is known to produce antibodies can be immunized with the pseudogene
polypeptide. Methods for immunization are well known in the art. Such methods include subcutaneous or intraperitoneal
injection of the polypeptide. One skilled in the art will recognize that the amount of the protein encoded by the ORF of
the present invention used for immunization will vary based on the animal which is immunized, the antigenicity of the
peptide and the site of injection.

The protein which is used as an immunogen may be modified or administered in an adjuvant in order to increase
the protein's antigenicity. Methods of increasing the antigenicity of a protein are well known in the art and include, but
are not limited to, coupling the antigen with a heterologous protein (such as globulin or galactosidase) or through the
inclusion of an adjuvant during immunization.

For monoclonal antibodies, spleen cells from the immunized animals are removed, fused with myeloma cells, such
as SP2/0-Ag14 myeloma cells, and allowed to become monoclonal antibody producing hybridoma cells.

Any one of a number of methods well known in the art can be used to identify the hybridoma cell which produces
an antibody with the desired characteristics. These include screening the hybridomas with an ELISA assay, western
blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175:109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class and subclass are determined using procedures
known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984)).

Techniques described for the production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to
produce single chain antibodies to proteins of the present invention.

For polyclonal antibodies, antibody containing antisera is isolated from the immunized animal and is screened for 
the presence of antibodies with the desired specificity using one of the above-described procedures. 

The present invention further provides the above-described antibodies in detectably labelled form. Antibodies can
be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels
(such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.),
paramagnetic atoms, etc. Procedures for accomplishing such labelling are well known in the art (see, for example,
Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308 (1979); Engval,
E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976)).

The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells
or tissues in which a fragment of the Staphylococcus aureus genome is expressed.

The present invention further provides the above-described antibodies immobilized on a solid support. Examples
of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose,
acrylic resins such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such solid supports
are well known in the art (Weir, D. M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34, Academic Press, N.Y. (1974)).
The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for
immunoaffinity purification of the proteins of the present invention.

3. Diagnostic Assays and Kits

The present invention further provides methods to identify the expression of one of the ORFs of the present
invention, or homolog thereof, in a test sample, using one of the DFs, antigens or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more of the antibodies, or one or more of
the DFs, or one or more antigens of the present invention and assaying for binding of the DFs, antigens or antibodies
to components within the test sample.

Conditions for incubating a DF, antigen or antibody with a test sample vary. Incubation conditions depend on the
format employed in the assay, the detection methods employed, and the type and nature of the DF or antibody used
in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification
or immunological assay formats can readily be adapted to employ the DFs, antigens or antibodies of the present
invention. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL, Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier
Science Publishers, Amsterdam, The Netherlands (1985); and PCT publication WO95/32291, all of which are hereby
incorporated herein by reference.

The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids
such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based
on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed.
Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry
out the assays of the present invention.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers
which comprises: (a) a first container comprising one of the DFs, antigens or antibodies of the present invention; and
(b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting
presence of a bound DF, antigen or antibody.

In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such
containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one
to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are
not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test sample, a container which
contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody, antigen or DF.

Types of detection reagents include labelled nucleic acid probes, labelled secondary antibodies or, in the
alternative, if the primary antibody is labelled, the enzymatic or antibody binding reagents which are capable of reacting
with the labelled antibody. One skilled in the art will readily recognize that the disclosed DFs, antigens and antibodies
of the present invention can be readily incorporated into one of the established kit formats which are well known in the art.

4. Screening Assay for Binding Agents 

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the
fragments and the Staphylococcus aureus fragment and contigs herein described.

In general, such methods comprise steps of:

(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention, or an isolated
fragment of the Staphylococcus aureus genome; and

(b) determining whether the agent binds to said protein or said fragment.

The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives,
or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed
using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected 
at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention 

Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally
selected or designed" when the agent is chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides
(for example, see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides, A User's
Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical
agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to
control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above,
such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled
artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control.

One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by
binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can
be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary
to a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney
et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano,
J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense
RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated
to be effective in model systems. Information contained in the sequences of the present invention can be used to design
antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
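
As an illustration of how sequence information might be turned into a candidate antisense oligonucleotide, the following minimal sketch returns the reverse complement of a chosen 20 to 40 base target region. The function name, the example sequence, and the choice of a DNA oligonucleotide are assumptions made for illustration only; they are not taken from the present disclosure, and no claim is made that an oligonucleotide chosen this way would be effective.

```python
# Illustrative sketch only: names and the example sequence are hypothetical.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_oligo(sense_dna, start, length=20):
    """Return the DNA oligo (written 5'->3') complementary to
    sense_dna[start:start + length], i.e. its reverse complement."""
    if not 20 <= length <= 40:
        raise ValueError("length is outside the 20 to 40 base range given in the text")
    target = sense_dna[start:start + length].upper()
    if len(target) != length:
        raise ValueError("target region runs past the end of the sequence")
    # Read the target backwards and complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(target))


# Made-up sense-strand sequence beginning at a start codon.
example_sense = "ATGAAAGCTCTGATTGTTCTGGGTCTG"
example_oligo = antisense_oligo(example_sense, 0, 20)
print(example_oligo)
```

Applying the same function to the resulting oligonucleotide returns the original target region, which is a quick check that the two strands are complementary.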

5. Pharmaceutical Compositions and Vaccines 

The present invention further provides pharmaceutical agents which can be used to modulate the growth or
pathogenicity of Staphylococcus aureus, or another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known techniques to provide a
pharmaceutical composition. As used herein, the "pharmaceutical agents of the present invention" refers to the pharmaceutical
agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are
identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or pathogenicity of Staphylococcus aureus
or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the
organism in question. The pharmaceutical agents of the present invention can modulate the growth or pathogenicity
of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to
practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth or
pathogenicity by binding to an important protein, thus blocking the biological activity of the protein, while other agents may
bind to a component of the outer surface of the organism, blocking attachment or rendering the organism more prone
to attack by the body's natural immune system. Alternatively, the agent may comprise a protein encoded by one of the ORFs
of the present invention and serve as a vaccine. The development and use of vaccines derived from membrane
associated polypeptides are well known in the art. The inventors have identified particularly preferred immunogenic
Staphylococcus aureus polypeptides for use as vaccines. Such immunogenic polypeptides are described above and
summarized in Table 4, below.

As used herein, a "related organism" is a broad term which refers to any organism whose growth or pathogenicity
can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will
contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As
such, related organisms do not need to be bacterial but may be fungal or viral pathogens.

The pharmaceutical agents and compositions of the present invention may be administered in a convenient
manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal
routes. The pharmaceutical compositions are administered in an amount which is effective for treating and/or
prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight
and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most
cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily, taking into account the routes of
administration, symptoms, etc.

The agents of the present invention can be used in native form or can be modified to form a chemical derivative.
As used herein, a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical
moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological
half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable
side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources,
REMINGTON'S PHARMACEUTICAL SCIENCES (1980), cited elsewhere herein.

For example, such moieties may change an immunological character of the functional derivative, such as affinity
for a given antibody. Such changes in immunomodulation activity are measured by the appropriate assay, such as a
competitive type immunoassay. Modifications of such protein properties as redox or thermal stability, biological
half-life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers
also may be effected in this way and can be assayed by methods well known to the skilled artisan.

The therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient
by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is
preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood
or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the
preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single
or multiple injections.

In providing a patient with one of the agents of the present invention, the dosage of the administered agent will
vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical
history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about
1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered. The
therapeutically effective dose can be lowered by using combinations of the agents of the present invention or another agent.

As used herein, two or more compounds or agents are said to be administered "in combination" with each other
when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can
be measured at the same time. The composition of the present invention can be administered concurrently with, prior
to, or following the administration of the other agent.

The agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to 
decrease the rate of growth (as defined above) of the target organism 

The administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose.
When provided prophylactically, the agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of
any subsequent infection. When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an
indication of infection. The therapeutic administration of the compound(s) serves to attenuate the pathological
symptoms of the infection and to increase the rate of recovery.

The agents of the present invention are administered to a subject, such as a mammal or a patient, in a
pharmaceutically acceptable form and in a therapeutically effective concentration. A composition is said to be
"pharmacologically acceptable" if its administration can be tolerated by a recipient patient. Such an agent is said to be administered
in a "therapeutically effective amount" if the amount administered is physiologically significant. An agent is
physiologically significant if its presence results in a detectable change in the physiology of a recipient patient.

The agents of the present invention can be formulated according to known methods to prepare pharmaceutically
useful compositions, whereby these materials, or their functional derivatives, are combined in admixture with a
pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins,
e.g., human serum albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed.,
Osol, A., Ed., Mack Publishing, Easton, PA (1980). In order to form a pharmaceutically acceptable composition suitable
for effective administration, such compositions will contain an effective amount of one or more of the agents of the
present invention, together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations
may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques, including formulation with
macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate,
methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent
in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired
time course of release. Another possible method to control the duration of action by controlled release preparations is
to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino
acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these
agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by
coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or
gelatine-microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example,
liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules, or in macroemulsions. Such
techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES (1980).

The invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or
more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be
a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals
or biological products, which notice reflects approval by the agency of manufacture, use or sale for human
administration.

In addition, the agents of the present invention may be employed in conjunction with other therapeutic compounds.

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be sequenced using a random shotgun
approach. This procedure, described in detail in the examples that follow, has eliminated the up-front cost of isolating
and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols.

Certain aspects of the present invention are described in greater detail in the examples that follow. The examples
are provided by way of illustration. Other aspects and embodiments of the present invention are contemplated by the
inventors, as will be clear to those of skill in the art from reading the present disclosure.

ILLUSTRATIVE EXAMPLES

LIBRARIES AND SEQUENCING

1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing follows from the Lander and Waterman
(Lander and Waterman, Genomics 2:231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, $P_0$, that any given base in a sequence of size $L$, in nucleotides, is not sequenced after
a certain amount, $n$, in nucleotides, of random sequence has been determined can be calculated by the equation
$P_0 = e^{-m}$, where $m$ is $n/L$, the fold coverage. For instance, for a genome of 2.8 Mb, $m = 1$ when 2.8 Mb of sequence has
been randomly generated (1X coverage). At that point, $P_0 = e^{-1} = 0.37$. The probability that any given base has not
been sequenced is the same as the probability that any region of the whole sequence $L$ has not been determined and,
therefore, is equivalent to the fraction of the whole sequence that has yet to be determined. Thus, at one-fold coverage,
approximately 37% of a polynucleotide of size $L$, in nucleotides, has not been sequenced. When 14 Mb of sequence
has been generated, coverage is 5X for a 2.8 Mb genome and the unsequenced fraction drops to 0.0067 or 0.67%. 5X coverage
of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both insert ends with
an average sequence read length of 410 bp.

Similarly, the total gap length, $G$, is determined by the equation $G = Le^{-m}$, and the average gap size, $g$, follows the
equation $g = L/n$, where $n$ here counts sequence reads (about 34,000 for 17,000 clones read from both ends) rather
than nucleotides. Thus, 5X coverage leaves about 240 gaps averaging about 82 bp in size in a sequence of a
polynucleotide 2.8 Mb long.

The treatment above is essentially that of Lander and Waterman, Genomics 2:231 (1988).
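
The figures quoted in this probability analysis can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are assumptions, the fold coverage is taken as sequenced nucleotides divided by genome size, and the average gap size is taken as genome size divided by the number of reads, which is the reading that reproduces the 82 bp figure.

```python
import math


def unsequenced_fraction(genome_bp, sequenced_bp):
    """P0 = exp(-m), with m = sequenced_bp / genome_bp (fold coverage)."""
    coverage = sequenced_bp / genome_bp
    return math.exp(-coverage)


def gap_estimates(genome_bp, num_reads, read_length_bp):
    """Estimate coverage, total gap length, average gap size and gap count."""
    sequenced_bp = num_reads * read_length_bp
    p0 = unsequenced_fraction(genome_bp, sequenced_bp)
    total_gap_bp = genome_bp * p0          # G = L * exp(-m)
    average_gap_bp = genome_bp / num_reads  # g = L / (number of reads)
    return {
        "coverage": sequenced_bp / genome_bp,
        "unsequenced_fraction": p0,
        "total_gap_bp": total_gap_bp,
        "average_gap_bp": average_gap_bp,
        "gap_count": total_gap_bp / average_gap_bp,
    }


# 17,000 clones read from both ends gives 34,000 reads of about 410 bp each.
estimates = gap_estimates(2_800_000, 34_000, 410)
for name, value in estimates.items():
    print(f"{name}: {value:.4g}")
```

With these inputs the coverage comes out just under 5X and the gap count a little above 230, in line with the rounded values of 5X and about 240 gaps given above.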

2. Random Library Construction 

In order to approximate the random model described above during actual sequencing, a nearly ideal library of
cloned genomic fragments is required. The following library construction procedure was developed to achieve this end.

Staphylococcus aureus DNA was prepared by phenol extraction. A mixture containing 600 ug DNA in 3.3 ml of
300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 30% glycerol was sonicated for 1 min at 0°C in a Branson
Model 450 Sonicator at the lowest energy setting using a 3 mm probe. The sonicated DNA was ethanol precipitated
and redissolved in 500 ul TE buffer.

To create blunt-ends, a 100 ul aliquot of the resuspended DNA was digested with 5 units of BAL31 nuclease (New
England BioLabs) for 10 min at 30°C in 200 ul BAL31 buffer. The digested DNA was phenol-extracted,
ethanol-precipitated, redissolved in 100 ul TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting
temperature agarose gel. The section containing DNA fragments 1.6-2.0 kb in size was excised from the gel, the
LGT agarose was melted, and the resulting solution was extracted with phenol to separate the agarose from the DNA.
DNA was ethanol precipitated and redissolved in 20 ul of TE buffer for ligation to vector.

A two-step ligation procedure was used to produce a plasmid library with 97% inserts, of which >99% were single
inserts. The first ligation mixture (50 ul) contained 2 ug of DNA fragments, 2 ug pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and was incubated
at 14°C for 4 hr. The ligation mixture then was phenol extracted and ethanol precipitated, and the precipitated DNA
was dissolved in 20 ul TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete bands in a ladder
were visualized by ethidium bromide-staining and UV illumination and identified by size as insert (i), vector (v), v+i,
v+2i, v+3i, etc. The portion of the gel containing v+i DNA was excised and the v+i DNA was recovered and resuspended
into 20 ul TE. The v+i DNA then was blunt-ended by T4 polymerase treatment for 5 min at 37°C in a reaction mixture
(50 ul) containing the v+i linears, 500 uM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs),
under recommended buffer conditions. After phenol extraction and ethanol precipitation, the repaired v+i linears were
dissolved in 20 ul TE. The final ligation to produce circles was carried out in a 50 ul reaction containing 5 ul of v+i
linears and 5 units of T4 ligase at 14°C overnight. After 10 min at 70°C the following day, the reaction mixture was
stored at -20°C.

This two-stage procedure resulted in a molecularly random collection of single-insert plasmid recombinants with 
minimal contamination from double-insert chimeras (<1%) or free vector (<3%) 

Since deviation from randomness can arise from propagation of the DNA in the host, E. coli host cells deficient in all
recombination and restriction functions (A. Greener, Strategies 3(1):5 (1990)) were used to prevent rearrangements,
deletions and loss of clones by restriction. Furthermore, transformed cells were plated directly on antibiotic diffusion
plates to avoid the usual broth recovery phase, which allows multiplication and selection of the most rapidly growing cells.

Plating was carried out as follows. A 100 ul aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene
200152) was thawed on ice and transferred to a chilled Falcon 2059 tube on ice. A 1.7 ul aliquot of 1.42 M
beta-mercaptoethanol was added to the aliquot of cells to a final concentration of 25 mM. Cells were incubated on ice for
10 min. A 1 ul aliquot of the final ligation was added to the cells and incubated on ice for 30 min. The cells were heat
pulsed for 30 sec at 42°C and placed back on ice for 2 min. The outgrowth period in liquid culture was eliminated
from this protocol in order to minimize the preferential growth of any given transformed cell. Instead, the transformation
mixture was plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (1.5% SOB agar:
20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented
with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml
X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4 per 100 ml SOB agar. The 15 ml top layer was poured just prior to plating.
Our titer was approximately 100 colonies/10 ul aliquot of transformation.
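
The reagent concentrations in this plating protocol follow from simple dilution arithmetic. The following sketch, in which the function name and the assumption that volumes are additive are illustrative choices rather than part of the protocol, checks the beta-mercaptoethanol and ampicillin figures.

```python
def final_concentration(stock_conc, stock_volume, receiving_volume):
    """Concentration after adding stock_volume of stock to receiving_volume.

    Assumes volumes are additive; both volumes must use the same unit and the
    result is in the same unit as stock_conc.
    """
    return stock_conc * stock_volume / (stock_volume + receiving_volume)


# 1.7 ul of 1.42 M (1420 mM) beta-mercaptoethanol added to 100 ul of cells.
bme_mM = final_concentration(1420.0, 1.7, 100.0)

# 0.4 ml of 50 mg/ml ampicillin added to 100 ml of SOB agar, reported in ug/ml.
ampicillin_ug_per_ml = final_concentration(50.0, 0.4, 100.0) * 1000.0

print(f"beta-mercaptoethanol: {bme_mM:.1f} mM")
print(f"ampicillin: {ampicillin_ug_per_ml:.0f} ug/ml")
```

The beta-mercaptoethanol value comes out slightly under the nominal 25 mM stated in the protocol, and the ampicillin in the bottom layer works out to roughly 200 ug/ml.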

All colonies were picked for template preparation regardless of size. Thus, only clones lost due to "poison" DNA
or deleterious gene products would be deleted from the library, resulting in a slight increase in gap number over that
expected.

3. Random DNA Sequencing 

High quality double stranded DNA plasmid templates were prepared using an alkaline lysis method developed in
collaboration with 5 Prime → 3 Prime, Inc. (Boulder, CO). Plasmid preparation was performed in a 96-well format for all
stages of DNA preparation, from bacterial growth through final DNA purification. Average template concentration was
determined by running 25% of the samples on an agarose gel. DNA concentrations were not adjusted.

Templates were also prepared from a Staphylococcus aureus lambda genomic library. An unamplified library was
constructed in Lambda DASH II vector (Stratagene). Staphylococcus aureus DNA (> 100 kb) was partially digested in
a reaction mixture (200 ul) containing 50 ug DNA, 1X Sau3AI buffer, and 20 units Sau3AI for 6 min at 23°C. The digested
DNA was phenol-extracted and centrifuged over a 10-40% sucrose gradient. Fractions containing genomic DNA of
15-25 kb were recovered by precipitation. One ul of fragments was used with 1 ul of DASH II vector (Stratagene) in
the recommended ligation reaction. One ul of the ligation mixture was used per packaging reaction following the
recommended protocol with the Gigapack II XL Packaging Extract. Phage were plated directly without amplification from
the packaging mixture (after dilution with 500 ul of recommended SM buffer and chloroform treatment). Yield was about
2.5 x 10^9 pfu/ul.

An amplified library was prepared from the primary packaging mixture according to the manufacturer's protocol.
The amplified library is stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1 x 10^9 pfu/ml.

Mini-liquid lysates (0.1 ul) are prepared from randomly selected plaques and template is prepared by long range 
PCR. Samples are PCR amplified using modified T3 and T7 primers, and Elongase Supermix (LTI) 

Sequencing reactions are carried out on plasmid templates using a combination of two workstations (BIOMEK
1000 and Hamilton Microlab 2200) and the Perkin-Elmer 9600 thermocycler with Applied Biosystems PRISM Ready
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers.
Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler
using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits. Modified T7 and T3 primers are
used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are run on a combination
of ABI 373 DNA Sequencers and ABI 377 DNA Sequencers. All of the dye terminator sequencing reactions are analyzed
using the 2X 9 hour module on the ABI 377. Dye primer reactions are analyzed on a combination of ABI 373 and ABI
377 DNA sequencers. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1
sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences,
445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.
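
The success rates and usable read lengths above can be combined into an expected yield per attempted reaction. The sketch below is a rough planning aid under the assumption that failed reactions contribute no usable sequence; the names and the 14 Mb target (the 5X figure from the probability analysis) are illustrative.

```python
import math

# (success rate, average usable read length in bp), as reported in the text.
CHEMISTRIES = {
    "M13-21": (0.85, 485),
    "M13RP1": (0.85, 445),
    "dye-terminator": (0.65, 375),
}


def expected_usable_bases(success_rate, usable_length_bp):
    """Expected usable bases per attempted reaction (failures count as zero)."""
    return success_rate * usable_length_bp


def reactions_needed(target_bp, success_rate, usable_length_bp):
    """Attempted reactions needed, on average, to accumulate target_bp."""
    return math.ceil(target_bp / expected_usable_bases(success_rate, usable_length_bp))


yields = {name: expected_usable_bases(*params) for name, params in CHEMISTRIES.items()}
for name, bases in yields.items():
    print(f"{name}: about {bases:.0f} usable bp per attempted reaction")

# Attempted M13-21 reactions needed to reach roughly 14 Mb of raw sequence.
m13_forward_attempts = reactions_needed(14_000_000, *CHEMISTRIES["M13-21"])
print(m13_forward_attempts)
```

In practice a project mixes all three chemistries, so these per-chemistry figures bound the number of reactions rather than predict it exactly.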

4. Protocol for Automated Cycle Sequencing

The sequencing was carried out using Hamilton Microstation 2200, Perkin Elmer 9600 thermocyclers, ABI 373
and ABI 377 Automated DNA Sequencers. The Hamilton combines pre-aliquoted templates and reaction mixes
consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing
primers, and reaction buffer. Reaction mixes and templates were combined in the wells of a 96-well thermocycling
plate and transferred to the Perkin Elmer 9600 thermocycler. Thirty consecutive cycles of linear amplification (i.e., one
primer synthesis) steps were performed, including denaturation, annealing of primer and template, and extension, i.e.,
DNA synthesis. A heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the need for
an oil overlay.

Two sequencing protocols were used: one for dye-labelled primers and a second for dye-labelled dideoxy chain
terminators. The shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four
terminator nucleotides. Each dye-primer was labelled with a different fluorescent dye, permitting the four individual
reactions to be combined into one lane of the 373 or 377 DNA Sequencer for electrophoresis, detection, and
base-calling. ABI currently supplies premixed reaction mixes in bulk packages containing all the necessary non-template
reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated templates with both
dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer usable
sequences.

Thirty-two reactions were loaded per ABI 373 Sequencer each day and 96 samples can be loaded on an ABI 377
per day. Electrophoresis was run overnight (ABI 373) or for 2 1/2 hours (ABI 377) following the manufacturer's protocols.
Following electrophoresis and fluorescence detection, the ABI 373 or ABI 377 performs automatic lane tracking and
base-calling. The lane-tracking was confirmed visually. Each sequence electropherogram (or fluorescence lane trace)
was inspected visually and assessed for quality. Trailing sequences of low quality were removed and the sequence
itself was loaded via software to a Sybase database (archived daily to 8mm tape). Leading vector polylinker sequence
was removed automatically by a software program. Average edited lengths of sequences from the standard ABI 373
or ABI 377 were around 400 bp and depend mostly on the quality of the template used for the sequencing reaction.
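
The automatic removal of leading vector polylinker sequence can be illustrated with a minimal sketch. The software actually used is not described in the text, so the function name `trim_leading_vector`, the `min_match` threshold and the exact-match strategy are assumptions made purely for illustration.

```python
def trim_leading_vector(read: str, polylinker: str, min_match: int = 8) -> str:
    """Strip vector polylinker bases from the start of a read.

    Hypothetical helper: it looks for the longest suffix of `polylinker`
    (at least `min_match` bases) that exactly matches the start of `read`
    and returns the read without those bases. A production tool would
    tolerate sequencing errors; this sketch uses exact matching only.
    """
    read_upper = read.upper()
    vector = polylinker.upper()
    shortest = max(min_match, 1)
    longest = min(len(read_upper), len(vector))
    # Try the longest possible match first, then shrink.
    for k in range(longest, shortest - 1, -1):
        if read_upper.startswith(vector[-k:]):
            return read[k:]
    # No sufficiently long vector match: leave the read untouched.
    return read
```

For example, with an illustrative polylinker string `"GGATCCTCTAGAGTCGAC"` (not taken from the source), a read beginning with its last twelve bases would have those twelve bases removed, while a read with no such prefix is returned unchanged.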

INFORMATICS 

1. Data Management 

A number of information management systems for a large-scale sequencing lab have been developed (for review
see, for instance, Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System
Sciences, IEEE Computer Society Press, Washington D.C., 585 (1993)). The system used to collect and assemble
the sequence data was developed using the Sybase relational database management system and was designed to
automate data flow wherever possible and to reduce user error. The database stores and correlates all information
collected during the entire operation, from template preparation to final analysis of the genome. Because the raw output
of the ABI 373 Sequencers was based on a Macintosh platform and the data management system chosen was based
on a Unix platform, it was necessary to design and implement a variety of multi-user, client-server applications which
allow the raw data as well as analysis results to flow seamlessly into the database with a minimum of user effort.

2. Assembly 

An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of thousands of sequence
fragments was employed to generate contigs. The TIGR Assembler simultaneously clusters and assembles fragments
of the genome. In order to obtain the speed necessary to assemble more than 10^4 fragments, the algorithm builds a
hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The
number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements.
Beginning with a single seed sequence fragment, TIGR Assembler extends the current contig by attempting to add
the best matching fragment based on oligonucleotide content. The contig and candidate fragment are aligned using a
modified version of the Smith-Waterman algorithm, which provides for optimal gapped alignments (Waterman, M. S.,
Methods in Enzymology 164:765 (1988)). The contig is extended by the fragment only if strict criteria for the quality
of the match are met. The match criteria include the minimum length of overlap, the maximum length of an unmatched
end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal
coverage and raised in regions with a possible repetitive element. The number of potential overlaps for each fragment
determines which fragments are likely to fall into repetitive elements. Fragments representing the boundaries of
repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of
alignments and excluded from the current contig. TIGR Assembler is designed to take advantage of clone size information
coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from
two ends of the same template point toward one another in the contig and are located within a certain range of base
pairs (definable for each clone based on the known clone size range for a given library).
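
Two of the steps described above can be sketched in a few lines: the 12 bp oligonucleotide hash table used to list potential fragment overlaps, and the clone-end constraint. This is not TIGR Assembler code. All names (`build_kmer_table`, `potential_overlaps`, `overlap_partner_counts`, `Placement`, `mates_consistent`) are hypothetical, only the forward strand is indexed, and the alignment step is omitted.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, NamedTuple, Set, Tuple


def build_kmer_table(fragments: Dict[str, str], k: int = 12) -> Dict[str, Set[str]]:
    """Map every k-mer to the names of the fragments that contain it."""
    table: Dict[str, Set[str]] = defaultdict(set)
    for name, seq in fragments.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            table[seq[i:i + k]].add(name)
    return table


def potential_overlaps(fragments: Dict[str, str], k: int = 12) -> Dict[Tuple[str, str], int]:
    """Count shared k-mers for each fragment pair (candidate overlaps)."""
    shared: Dict[Tuple[str, str], int] = defaultdict(int)
    for names in build_kmer_table(fragments, k).values():
        for pair in combinations(sorted(names), 2):
            shared[pair] += 1
    return dict(shared)


def overlap_partner_counts(overlaps: Dict[Tuple[str, str], int]) -> Dict[str, int]:
    """Number of candidate partners per fragment; unusually high counts
    hint that a fragment may lie in a repetitive element."""
    counts: Dict[str, int] = defaultdict(int)
    for a, b in overlaps:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


class Placement(NamedTuple):
    start: int   # leftmost contig coordinate of the read
    end: int     # rightmost contig coordinate of the read
    strand: str  # "+" or "-"


def mates_consistent(a: Placement, b: Placement, min_size: int, max_size: int) -> bool:
    """Check that two reads from the ends of one clone point toward each
    other and span a distance within the library's clone size range."""
    if {a.strand, b.strand} != {"+", "-"}:
        return False
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    span = minus.end - plus.start
    return plus.start < minus.start and min_size <= span <= max_size
```

In this sketch a pair of fragments becomes a candidate for alignment as soon as it shares at least one 12-mer. A real assembler would rank candidates, handle reverse complements, and confirm each overlap by gapped alignment before extending the contig.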

3. Identifying Genes 

The predicted coding regions of the Staphylococcus aureus genome were initially defined with the program zorf,
which finds ORFs of a minimum length. The predicted coding region sequences were used in searches against a
database of all Staphylococcus aureus nucleotide sequences from GenBank (release 92.0) using the BLASTN search
method to identify overlaps of 50 or more nucleotides with at least 95% identity. Those ORFs with nucleotide sequence
matches are shown in Table 1. The ORFs without such matches were translated to protein sequences and
compared to a non-redundant database of known proteins generated by combining the Swiss-Prot, PIR and GenPept
databases. ORFs of at least 80 amino acids that matched a database protein with BLASTP probability less than or
equal to 0.01 are shown in Table 2. The table also lists assigned functions based on the closest match in the databases.
ORFs of at least 120 amino acids that did not match protein or nucleotide sequences in the databases at these levels
are shown in Table 3.
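
The selection logic above can be sketched as follows. The internals of zorf are not described in the text, so the stop-to-stop, six-frame definition of an ORF used here is an assumption, and the names `find_orfs` and `assign_table` are hypothetical. Only the thresholds (50 nucleotides at 95% identity, 80 amino acids at P <= 0.01, 120 amino acids with no match) come from the description.

```python
from typing import List, Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def _open_frames(seq: str, min_codons: int) -> List[Tuple[int, int, int]]:
    """Stop-free stretches of at least `min_codons` codons in three frames.
    Returns (frame, start, end) with `end` exclusive."""
    found = []
    for frame in range(3):
        start = frame
        for pos in range(frame, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    found.append((frame, start, pos))
                start = pos + 3
        # Stretch left open at the end of the sequence.
        end = frame + ((len(seq) - frame) // 3) * 3
        if end > start and (end - start) // 3 >= min_codons:
            found.append((frame, start, end))
    return found


def find_orfs(seq: str, min_codons: int = 80) -> List[Tuple[str, int, int, int]]:
    """Six-frame scan; coordinates refer to the strand that was scanned."""
    seq = seq.upper()
    orfs = [("+",) + hit for hit in _open_frames(seq, min_codons)]
    orfs += [("-",) + hit for hit in _open_frames(reverse_complement(seq), min_codons)]
    return orfs


def assign_table(length_aa: int,
                 nt_hit: Optional[Tuple[int, float]] = None,
                 blastp_p: Optional[float] = None) -> Optional[str]:
    """Apply the stated thresholds; `nt_hit` is (overlap length, % identity)."""
    if nt_hit is not None and nt_hit[0] >= 50 and nt_hit[1] >= 95.0:
        return "Table 1"
    if blastp_p is not None and blastp_p <= 0.01 and length_aa >= 80:
        return "Table 2"
    if length_aa >= 120:
        return "Table 3"
    return None
```

Under these assumptions, an ORF with a qualifying BLASTN hit goes to Table 1 regardless of length, and an ORF shorter than 80 amino acids with no nucleotide match is not reported in any table.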

ILLUSTRATIVE APPLICATIONS 

1. Production of an Antibody to a Staphylococcus aureus Protein 

Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the
methods known in the art. The protein can also be produced in a recombinant prokaryotic expression system, such as
E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by
concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the
protein can then be prepared as follows.

2. Monoclonal Antibody Production by Hybridoma Fusion 

Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from
murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein
over a period of a few weeks. The mouse is then sacrificed, and the antibody-producing cells of the spleen isolated.
The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells
destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused
cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued.
Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay
procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified
methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use.
Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular
Biology, Elsevier, New York, Section 21-2 (1989).

3. Polyclonal Antibody Production by Immunization 

Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by
immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance
immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and
the host species. For example, small molecules tend to be less immunogenic than others and may require the use of
carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and
excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple
intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis,
J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971).

Booster injections can be given at regular intervals, and antiserum harvested when the antibody titer thereof, as
determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the
antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology,
Wier, D., ed., Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum
(about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described,
for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer.
Soc. For Microbiology, Washington, D.C. (1980).

Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which
determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively
or qualitatively to identify the presence of antigen in a biological sample. In addition, they are useful in various animal
models of Staphylococcal disease known to those of skill in the art as a means of evaluating the protein used to make
the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic
reagent.

3. Preparation of PCR Primers and Amplification of DNA 

Various fragments of the Staphylococcus aureus genome, such as those of Tables 1-3 and SEQ ID NOS: 1-5,191,
can be used in accordance with the present invention to prepare PCR primers for a variety of uses. The PCR primers
are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence,
it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are
approximately the same. The PCR primers and amplified DNA of this Example find use in the Examples that follow.

4. Gene expression from DNA Sequences Corresponding to ORFs

A fragment of the Staphylococcus aureus genome provided in Tables 1-3 is introduced into an expression vector
using conventional technology. Techniques to transfer cloned sequences into expression vectors that direct protein
translation in mammalian, yeast, insect or bacterial expression systems are well known in the art. Commercially
available vectors and expression systems are available from a variety of suppliers including Stratagene (La Jolla, California),
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to enhance expression and facilitate
proper protein folding, the codon context and codon pairing of the sequence may be optimized for the particular
expression organism, as explained by Hatfield et al., U.S. Patent No. 5,082,767, incorporated herein by this reference.

The following is provided as one exemplary method to generate polypeptide(s) from cloned ORFs of the
Staphylococcus aureus genome fragment. Bacterial ORFs generally lack a poly A addition signal. The addition signal sequence
can be added to the construct by, for example, splicing out the poly A addition sequence from pSG5 (Stratagene) using
BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1
(Stratagene) for use in eukaryotic expression systems. pXT1 contains the LTRs and a portion of the gag gene of Moloney
Murine Leukemia Virus. The positions of the LTRs in the construct allow efficient stable transfection. The vector includes
the Herpes Simplex thymidine kinase promoter and the selectable neomycin gene. The Staphylococcus aureus DNA
is obtained by PCR from the bacterial vector using oligonucleotide primers complementary to the Staphylococcus
aureus DNA and containing restriction endonuclease sequences for PstI incorporated into the 5' primer and BglII at
the 5' end of the corresponding Staphylococcus aureus DNA 3' primer, taking care to ensure that the Staphylococcus
aureus DNA is positioned such that it is followed by the poly A addition sequence. The purified fragment obtained from
the resulting PCR reaction is digested with PstI, blunt ended with an exonuclease, digested with BglII, purified and
ligated to pXT1, now containing a poly A addition sequence and digested with BglII.

The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island,
New York) under conditions outlined in the product specification. Positive transfectants are selected after growing the
transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri). The protein is preferably released into the supernatant.
However, if the protein has membrane binding domains, the protein may additionally be retained within the cell or
expression may be restricted to the cell surface. Since it may be necessary to purify and locate the transfected product,
synthetic 15-mer peptides synthesized from the predicted Staphylococcus aureus DNA sequence are injected into
mice to generate antibody to the polypeptide encoded by the Staphylococcus aureus DNA.

Alternatively, and if antibody production is not possible, the Staphylococcus aureus DNA sequence is additionally
incorporated into eukaryotic expression vectors and expressed as, for example, a globin fusion. Antibody to the globin
moiety then is used to purify the chimeric protein. Corresponding protease cleavage sites are engineered between the
globin moiety and the polypeptide encoded by the Staphylococcus aureus DNA so that the latter may be freed from
the former by simple protease digestion. One useful expression vector for generating globin chimerics is pSG5
(Stratagene). This vector encodes a rabbit globin. Intron II of the rabbit globin gene facilitates splicing of the expressed
transcript, and the polyadenylation signal incorporated into the construct increases the level of expression. These
techniques are well known to those skilled in the art of molecular biology. Standard methods are published in methods
texts such as Davis et al., cited elsewhere herein, and many of the methods are available from the technical assistance
representatives from Stratagene, Life Technologies, Inc., or Promega. Polypeptides of the invention also may be
produced using in vitro translation systems such as the in vitro Express™ Translation Kit (Stratagene).

While the present invention has been described in some detail for purposes of clarity and understanding, one
skilled in the art will appreciate that various changes in form and detail can be made without departing from the true
scope of the invention.

All patents, patent applications and publications referred to above are hereby incorporated by reference.

ProX
hypothetical protein
          187-198     244-261     268-278     308-317
310_8     131-140     144-153     177-186     190-199     204-213     216-227
601_1     208-218
544_3     170-179     184-193     224-235     274-287     327-336     352-361
662_1
87_7
120_1

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Table 4 



ORF       Antigenic Regions (cont)
          Region 11   Region 12   Region 13   Region 14   Region 15   Region 16
168_6
238_1
51_2
278_3
276_2
45_4
316_8
154_15
228_3
228_6
50_1
112_7
442_1
66_2
304_2
44_1
161_4
46_5      306-315
942_1
5_4       393-407     416-426     456-465
20_4      396-405     410-419     461-481
328_2
520_2
771_1
999_1
853_1
287_1
288_2
596_2
217_5
217_6
528_3
171_11
63_4
353_2
743_1
342_4
69_3
70_6      453-471     506-515
129_2     296-315
58_5
188_3
236_6     358-377     410-423     428-439     442-457     467-476     480-493
310_8     238-251     256-275     281-290     296-310     314-333     338-347
601_1
544_3
662_1
87_7
120_1

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Table 4 



ORF       Antigenic Regions (cont)
          Region 17   Region 18   Region 19   Region 20   Region 21   Region 22
168_6
238_1
51_2
278_3
276_2
45_4
316_8
154_15
228_3
228_6
50_1
112_7
442_1
66_2
304_2
44_1
161_4
46_5
942_1
5_4
20_4
328_2
520_2
771_1
999_1
853_1
287_1
288_2
596_2
217_5
217_6
528_3
171_11
63_4
353_2
743_1
342_4
69_3
70_6
129_2
58_5
188_3
236_6
310_8     357-366     370-379     429-438     443-452     478-487     551-560
601_1
544_3
662_1
87_7
120_1

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Table 4 



ORF       Antigenic Regions (cont)
          Region 23   Region 24   Region 25   Region 26   Region 27   Region 28
168_6
238_1
51_2
278_3
276_2
45_4
316_8
154_15
228_3
228_6
50_1
112_7
442_1
66_2
304_2
44_1
161_4
46_5
942_1
5_4
20_4
328_2
520_2
771_1
999_1
853_1
287_1
288_2
596_2
217_5
217_6
528_3
171_11
63_4
353_2
743_1
342_4
69_3
70_6
129_2
58_5
188_3
236_6
310_8     622-632     670-685     708-718     823-836     858-867     877-886
601_1
544_3
662_1
87_7
120_1

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Table 4 



ORF       Antigenic Regions (cont)
          Region 29   Region 30
168_6
238_1
51_2
278_3
276_2
45_4
316_8
154_15
228_3
228_6
50_1
112_7
442_1
66_2
304_2
44_1
161_4
46_5
942_1
5_4
20_4
328_2
520_2
771_1
999_1
853_1
287_1
288_2
596_2
217_5
217_6
528_3
171_11
63_4
353_2
743_1
342_4
69_3
70_6
129_2
58_5
188_3
236_6
310_8
601_1
544_3
662_1
87_7
120_1

Table 4

ORF             BLAST HOMOLOG                   Antigenic Regions
                                                Region 1    Region 2    Region 3    Region 4
46_1      5241  aldehyde dehydrogenase          8-17        36-52       83-96
63_4      5242  glycerol ester hydrolase (P.    9-26        57-73       93-107      123-133
174_6     5243  ketopantoate hydroxymeth        203-212     242-254     265-274
206_16    5244  ornithine acetyltransferase     1-10        34-43       54-63       194-210
267_1     5245  NaH-antiporter protein (E.      120-129     332-347     398-408
322_1     5246  acriflavin resistance protein   58-75       153-164     203-231     264-284
415_2     5247  transport ATP binding prot      108-126     218-227     298-308     315-334
214_3     5248  2-nitropropane dioxygenase      123-136     216-233     283-292     297-306
587_3     5249  clumping factor                 5-14        43-54       59-68       76-95
685_1     5250  signal peptidase                59-68       72-81       86-95       99-108
54_3      5251  fibronectin binding protein     23-32       37-46       50-59       89-98
54_4      5252  fibronectin binding protein     43-52       66-75       95-104      147-156
54_5      5253  fibronectin binding protein     49-60       81-90
54_6      5254  fibronectin binding protein     55-71       82-97       139-158     175-186
328_1     5255  lipoprotein (H. flu)            11-20       61-70       96-105

Table 4

ORF       Antigenic Regions (cont)
          Region 5    Region 6    Region 7    Region 8    Region 9    Region 10
46_1      215-242     333-352     376-385     416-432     471-487
63_4      145-154     1??-202     212-223     245-265     274-283     291-300
174_6
206_16    239-259     275-284
267_1
322_1     298-319     350-359
415_2     344-353     371-380     395-404     456-465     486-495     518-52?
214_3     318-3??     365-375
587_3     106-115     142-151     156-166     173-182     186-198     204-213
685_1     113-122     130-145
54_3      128-138     185-194     217-226     251-260     268-277     295-305
54_4      175-188     191-200     203-212     220-229
54_5
54_6      220-230     287-304     317-326     344-353     364-373     378-387
328_1

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Table 4 



ORF       Antigenic Regions (cont)
          Region 11   Region 12   Region 13   Region 14   Region 15   Region 17
46_1
63_4      306-315     319-328     366-376     453-462     467-476
174_6
206_16
267_1
322_1
415_2     539-555
214_3
587_3     217-226     318-327     332-342     351-360
685_1
54_3      316-325     3?9-345     355-372     387-396     416-425     438-448
54_4
54_5
54_6      396-407     427-436     514-531     541-550     569-578     612-62?
328_1

Table 4

ORF       Antigenic Regions (cont)
          Region 18   Region 19   Region 20   Region 21   Region 22   Region 23
46_1
63_4      485-500     513-525
174_6
206_16
267_1
322_1
415_2
214_3
587_3     396-405     425-442     459-470     485-494     505-514     531-562
685_1
54_3      455-462     472-491     517-536
54_4
54_5
54_6      639-648     673-681     703-715     723-732     749-760     772-788
328_1

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Table 4 



ORF       Antigenic Regions (cont)
          Region 24   Region 25   Region 26   Region 27   Region 28   Region 29
46_1
63_4
174_6
206_16
267_1
322_1
415_2
214_3
587_3     567-578     584-601     607-840     844-854     858-870     877-886
685_1
54_3
54_4
54_5
54_6      793-802     811-826     834-848     866-876     893-903     907-918
328_1

Table 4

ORF       Antigenic Regions (cont)
          Region 30   Region 31
46_1
63_4
174_6
206_16
267_1
322_1
415_2
214_3
587_3     889-911     927-936
685_1
54_3
54_4
54_5
54_6      925-944     [illegible]
328_1

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SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: Human Genome Sciences, Inc.
(B) STREET: 9410 Key West Avenue
(C) CITY: Rockville
(D) STATE: Maryland
(E) COUNTRY: US
(F) POSTAL CODE: 20850

(ii) TITLE OF INVENTION: Staphylococcus aureus Polynucleotides and Sequences

(iii) NUMBER OF SEQUENCES: 5255

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4 Mb storage
(B) COMPUTER: HP Vectra 486/33
(C) OPERATING SYSTEM: MSDOS version 6.2
(D) SOFTWARE: ASCII Text

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:
(B) FILING DATE:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 60/009,861
(B) FILING DATE: 05-JAN-1996

(2) INFORMATION FOR SEQ ID NO: 1:
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5895 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

TCCATTATGA AGTCACAAGT ACTATAAGCT GCGATGTTAC CAATGTTTTT TAAAATCCCA 60 

GTAATAAAAT CAAAAAATAA GTTAAATAAT GTATTCATTT TAAGTCCTCC TTAATAAAGa 120 

J5 aaataGGTAA TAATGTAATA GCTTCTATTA TGATGCCTAA TTGAATGAAT TGGGCAAATG 180 

GCTCTTTGAT GATAAGTGTG ATAATGAAAA GGGTTAAACT AACAATAATC GCATAATATT 24 0 

TTTTTCGTTT AATAAGTCGC A CAGG AATGG GCTTCTTTTT AGTTGCTGCA GG AG CAT AT A 3 00 

20 CTGAGATTAC ACCTAAAGAA ATAACTGTTA AAATAATCAT AATTAAAAAG TTAATATGAA 3 60 

AATTTACTAT TACTAAAGGT AAAAGTATAA ATAGTATAAT ACTTTCTACA TAACACCAAA 4 20 

AAGAAGAAGG TGCATGTGCa CCATGTGCAT GtCTTCTTAT TAAATAAAAT GTTAAATTCG 4 80 

25 

TAATTAACGT AAACAGAAAA ATGTTTAAAA TATAGGCAAT AGTATACATA ACAATTAATT 54 0 

TACCTATATT TTTAGCTAAG ACCTGCATCC CTAATCGTAC TTGCAAAAAT TGAATATGAT 6 00 

CTAAGTTATT TCTCTTTTGA AGATACGTGG CAAACTGGTC AATTTTATTA TCAAAATAAT 66 0 

30 

TCAATTTTAC ACCACTCTCC TCACTGTCAT TATACGATTT AGTACAATCT IT TAT C ATT A 72 0 

TATTGCCTAA CTGTAGGAAA TAAATACTTA ACTGTTAAAT GTAATTTGTA TTTAATATTT 78 0 

35 TAACATAAAA AAATTTACAG TTAAGAATAA AAAACGACTA GTTAAGAAAA ATTGGAAAAT 84 0 

AAATGCTTTT AGCATGTTTT AATATAACTA GATCACAGAG ATGTGATGGA AAATAGTTGA 90 0 

TGAGTTGTTT AATTTTAAGA ATTTTTATCT TAATTAAGGA AGGAGTGATT TCAATGGCAC 960 

40 

AAGATATCAT TTCAACAATC GGTGACTTAG TAAAATGGAT TATCGACACA GTGAACAAAT 102 0 

TCACTAAAAA ATAAGATGAA TAATTAATTA CTTTCATTGT AAATTTGTTA TCTTCGTATA 108 0 

GTACTAAAAG TATGAGTTAT TAAGCCATCC CAACTTAATA ACCATGTAAA ATTAGCAAGT 114 0 

45 

GAGTAACATT TGCTAGTAGA GTTAGTTTCC TTGGACTCAG TGCTATGTAT TTTTCTTAAT 1200 

TATCATTACA GATAATTATT TCTAGCATGT AAGCTATCGT AAA CAA CATC GATTTATCAT 126 0 

SO TATTTGATAA AT AAAATT TT TTTCATAATT AATAACATCC CCAAAAATAG ATTGAAAAAA 132 0 

TAACTGTAAA ACATTCCCTT AATAATAAGT ATGGTCGTGA GCCCCTCCCA AGCTCGCGGC 13 80 

CTTTTTTGTA ATGAAGAAGG GATGAGTTAA TCATCATTAT GAGACCCGCC GTTAAAATAT 144 0 
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TCATTTGCAA AGGGCGAAAT GGGTTCTTAC TGAGTTATCT ATTATAAAAA AATAAACATA 15 6C 

GACTTATGAA AAATCTCTCA TAAATCTATG TTTAGTCATG aCATGTGTTA AATATTATTT 162 0 

CGGGCGCTTC TTATTTATAC AAATCTAATT TAATACTTTT AAATACAGGT ATATTTTCgC 16 8 C 

GTTGCTGTTC TACTTCATTT AAG7TTAAAT CTACAGTCAA AATATCTGCG GATTCATTTA 174 0 

ATTCTCCAAC TAAATCTCCA TTTGGGTTTA TAACTATCGA ATGACCAGCA TATTCTGTGT 18 00 

W 

TACCATCGAA TCCAGTGCTA TTAGTTCCAA TGACAAACAT ATTATTTTCA ATTGCACGTG I86 0 

CCTTTAGTAA TGAATGCCAA TGTTGAAGAC GTGACATAGG CCATTGCGCC ACATAAAATG 1920 

is CAATTTTAGC ACCACTACGA GCAGG AT ATC TTAATAATTC TGGAAAACGT AAATCATAAC 19 80 

AGATAAGTTG GGTCACATAA GTACCGTCAG A CAATTG AAA GGGTTCAGCT ACGTATTCGC 2040 

CAGCGGTTAA AAATTCATGC TCTCTTAACA TAGGAACTAA ATGAACTTTG TCGTATTCaT 2100 

20 TAATCAGCTG GCCACTTTTA TTCACACTAA AAGCTGTATT AAATATTTGA TTGTTTCTAA 2160 

TGTTAGAAAC TGACCCAGCT ACGATATCGA CTTTATATTT TTCAGCTAAA TGTTTAATAA 2220 

ATGAAAAACT TTGTCCTAGA TTATTATCTG CTTTTTCATT TAAATGCTCT AAATCATAGC 2280 

25 

C ATT ATT CCA CATTTCAGGT AAAACGACTA CATCTACTTC AGCATTCATA TTTTTTTCGA 2 340 

ACCATTGCGT TATTTGAGTT TCATTTTTAG AACTATCTCC AAAAACAATC GGTAATTGAT 24 00 

AAATTTGGAC TTTCATAACA TCACATCCTT GATAGATCTT ATATATAACT TACTAAAAGT 2460 

TATGTTGAAA CGCAAAAAAC GAG CACAAGA CATAAAATCA AAGTCCTAGG CTCTACAAAG 2 520 

TTATATTGAC AGTAGTTGAT GGGGCCCCAA CATAGAGAAA TTGGAACACC AATTTCTACA 2 580 

35 GACAATGCAA GTTGGGGTGG GCTCTAACAT AAAGAAATAC TTTTTCTTTA GAAATTAGTA 2 640 

TTTCTTATAC ATGAGTTTTA CTCATGTATT CCTATTCTTA AGTGCACATT AGCAGCGGCT 2 7 00 

AATGTGTAAG AACTACTACA TAATGAATAA CTAATGATTC TTTATCATTT CTGTCCCATT 2 760 

CCTAACAATA TATTGATTAT TTTTTTATTA CGAAACGATC TTCCACTGGA TTAAATGTTT 2 820 

TTTCGCCAGC AGCTTCACGA ATATCACCAA A TGG CATTTG AGCAATAAGT TTCCAACTTT 2 880 

TAGGAATATT AAATTCATTT GAAGTCATCT CATCAACAAG TGGATTATAG TGTTGTAATG 2 940 

■45 

AAGCACC7AT GCCTTTAGTA GCTAATGCAG TCCAAATTGC AAATTGATGC A TGG CATTTG 3 000 

TTTGAGTTGA CCATATTGCA AAATTATCAT AGTAGTTTGG CATTTGTTCT TGTAAACCAC 3 060 

50 TTACAACATC TTGATCTTCA TAAAACAAAA TTGTACCGTA TGAATGTTTG AAGTTATCAA 3120 

TTTTTTGTTC AGTTGGCTCG AAATCACGAT TCTCTCCCAT GACTTCTTTT AAAATTGCTT 3180 

TTGTGTTATC CCAAAATTTA TTATTGTTGT CATTTAACAA GAGAACAATT CTAGTTGATT 3 24 0 

55 



40 



218 



EP0 786 519 A2 

CATCGCTAAT TGATATCGAA TCTTTCAAAT TATATATTGA ACGTCTTTCT TCCATTGCAT 3360 

TGTCAAAAGT CATTGCTTTT TT AT CTTTTT TAAATAAGCC CAT AA IT ATT GCTCCTTCTT 34 2 0 

TAGTAAAGAA TACTTAATAG ACTAAGTATA AAA TIT ATA C TCGTACTTGT AAAGCAATAT 34 8 0 

TTACGAAAAT TTCAAGAATA TTAATATTCA TTTTCAAATT CCAAATATAA ATG CATTTTC 3 54 0 

AACGCATATT TATTATACTT AGATTAATAC TTACATGAAA AAGGGAGGTG TCTCGTGAAA 3 5 00 

TGTCATATCA TTGGTTTAAG AAAATGTTAC TTTCAACAAG TATTTTAATT TTAAGTAGTA 3 660 

GTAGTTTAGG GCTTGCAACG CACACAGTTG AAGCAAAGGA TAACTTAAAT GGAGAAAAAC 3720 

CAACTACTAA TTTGAATCAT AATATAACTT CACCATCAGT AAATAGTGAA ATGAATAATA 3 78 0 

ATGAGACTGG GACACCTCAC GAATCAAATC AAACGGGTAA TGAAGGAACA GGTTCGAATA 3 34 0 

GTCGTGATGC TAATCCTGAT TCGAATAATG TGAAGCCAGA CTCAAACAAC CAAAACCCAA 3 900 

GTACAGATTC AAAACCAGAC CCAAATAACC AAAACTCAAG TCCGAATCCT AAACCAGATC 3 96 0 

CAGATAACCC GAAACCAAAA CCGGATCCAA AACCAGACCC AGATAAACCA AAGCCAAATC 4 02 0 

CGGATCCAAA ACCAGATCCA GATAACCCGA AACCAAATCC AGATCCAAAA CCAGACCCAG 4 080 

ATAAACCAAA GCCAAATCCG GATCCAAAAC CAGATCCAGA TAAACCAAAG CCAAATCCGA 414 0 

ATCCAAAACC AGACCCTAAT AAGCCAAATC CTAACCCGTC ACCAGATCCC GATCAACCTG 42 0 0 

GGGATTCCAA TCATTCTGGT GGCTCGAAAA ATGGGGGGAC ATGGAACCCA AATG CTTCAG 4 26 0 

ATGGATCTAA TCAAGGTCAA TGGCAACCAA ATGGGAATCA AGGAAACTCA CAAAATCCTA 4 32 0 

CTGGTAATGA TTTTGTATCC CAACGATTTT TAGCCTTGGC AAATGGGGCT TACAAG TATA 4 3 80 

ATCCGTATAT TTTAAATCAA ATTAATAAGT TGGGCAAAGA TTATGGAGAA GTTACTGATG 44 4 0 

AAGACATTTA TAATATTATT CGAAAACAAa ATTTCAGCGG AAATGCATAT TTAAATGGAT 4 500 

TACAACAGCA ATCGAATTAC TTTAGATTCC aATATTTCAA TCCATTGAAA TCAGAAAGGT 4 560 

ACTATCGTAA TTTAGATGAA CAAGTACTCG CATTAATTAC TGGTGAAATT GGATCAATGC 4 620 

CAGATTTGAA AAAGCCCGAA GATAAGCCGG ATTCAAAACA ACGCTCATTT GAACCGCATG 4 6 80 

AAAAAGACGA TTTTACAGTA GTTAAAAAAC AAGAAGATAA TAAGAAAAGT GCGTCAACTG 4 74 0 

CATATAGTAA AAGTTGGCTA GCAATTGTAT GTTCTATGAT GGTGGTATTT TCAATCATGC 4 800 

TATTCTTATT TGTAAAGCGA AATAAAAAGA AAAATAAAAA CGAATCACAG CGACGATAAT 4 86 0 

CCGTGTGTGA TTCGTTTTTT TTATTATGGA ATAAAAATGT GATATATAAA ATTCGCTTGT 4 92 0 

TCCGTGGCTT TTTTCAAAGC CTCAGGATTA AGTAATTGGA ATATAACGAC AAATCCGTTT 4980 

TGTAACATAT GGATAATAAT TGGAACAGCA AGCCGTTTTG TCCAAACATA TGCTAATGAA 5 04 0 
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AATATTAATC AACTTACTGT TGTAGCAATA ATAAATGCCA CGATACGATT ACCTTTAATC 5160 

GCATTAAATA ATTCTCCAAA GATTACTTTT CTGAATACAT ATTCTTCTAA TAAAGGACCA 5220

ATAATAGATA CAAAGAAGAT AAATATAGGT ATTTTTCGAG CAATAATAAT TA3CTTTTCT 5280

GTATTAGGAC TTACTTGTTG TCCACCATAA ATTTGCGTTA ATACAATGCT CACTACCATT 5340

TGATAAATCA TTACCAATGC AAATCCAAGC AATGCCCATG GAATGATATA TTTTTTAGGT 5400

TCTTTAACTT CTAATTCTAA TTTTGTTGGA TTTTTAATTT TTAAATTAAT TAAAATAATC 5460

GTCGTGGCGG CGATTAAAAA TAGAACAAGT TGTATGTAAA TGACTGCTTT AGTCAGTTCT 5520

ATGCCACTAT ATTGTACAAA TGGTAATTTT TTTACAATGA GAAGCGGTAA AAATTGAGAC 5580

AATATATAAA TAATAACAGT TAGCAATGAT GCCCATAATC tTGTCATAAT TTTCCTCCAA 5640

ATATTTGTTT ATAATTTATT TTATCGTAAA TAACTTGAAG TTACAAAACT TAATTAAAAG 5700

GTTATGACTT GAAATTTTGA CCAAATTTGA TTATTATAAA TGTATGTTAG CACTCTTTAA 5760

TGTTAAGTGC TAAACTTTAG GTTTTTTAAG GAGGAACAAT CATGCTAAAA CCAATTGGAA 5820

ATCGTGTGAT TATTGAGAAA AAAGAACAAG AACAAACAAC TAAAAGTGGn ATTGTTTAAC 5880

TGATAGTGCT AAAGA 5895
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6796 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
TTTGAAAAAA CAAGGTACGA TTGGTTTAAT AACATATATG AGAACCGATT CTACACGTAT 60

TTCaGATACT GCCAAAGTTG AAGCAAAACA GTATATAACT GATAAATACG GTGAATCTTA 120

CACTTCTAAA CGTAAAGCAT CAGGGAAACA AGGTGACCaA GATGCCCATG AGGCTATTAG 180

ACCTTCAAGT ACTATGCGTA CGCCAGATGA TATGAAGTCA TTTTTGACGA AAGACCAATA 240

CCGATTATAC AAATTAATTT GGGAACGATT TGTTGCTAGT CAAATGGCTC CAGCAATACT 300

TGATACAGTC TCATTAGACA TAACACAAGG TGACATTAAA TTTAGAGCGA ATGGTCAAAC 360

AATCAAGTTT AAAGGATTTA TGACACTTTA TGTAGAAACT AAAGATGATA GTGATAGCGA 420

AAAGGAAAAT AAACTGCCTA AATTAGAGCA AGGTGATAAA GTCACAGCAA CTCAAATTGA 480

ACCAGCTCAA CACTATACAC AACCACCTCC AAGATATACT GAGGCGAGAT TAGTAAAAAC 540
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AAAGCGTAAC TATGTCAAAT TAGAAAGTAA GCGTTTTGTT CCTACTGAGT TGGGAGAAAT 660 

AGTTCATGAA CAAGTGAAAG AATACTTCCC AGAGATTATT GATGTGGAAT TCACAGTGAA 720

TATGGAAACG TTACTTGATA AGATTGCAGA AGGCGACATT ACATGGAGGA AAGTAATCGA 780

CGGTTTCTTT AGTAGCTTTA AACAAGATGT TGAACGTGCT GAAGAAGAGA TGGAAAAGAT 840

TGAAATCAAA GATGAGCCAG CCGGTGAAGA CTGTGAAATT TGTGGTTCTC CTATGGTTAT 900

AAAAATGGGA CGCTATGGTA AGTTCATGGC TTGCTCAAAC TTCCCGGATT GTCGTAATAC 960

AAAAGCGATA GTTAAGTCTA TTGGTGTTAA ATGTCCAAAA TGTAATGaTG GTGACGTCGT 1020

AGAAAGAAAA TCTAAAAAGA ATCGTGTCTT TTATGGATGT TCGAAATATC CTGAATGCGA 1080

CTTTATCTCT TGGGATAAGC CGATTGGAAG AGATTGTCCA AAATGTAACC AATATCTTGT 1140

TGAAAATAAA AAAGGCAAGA CAACACAAGT AATATGTTCA AATTGCGATT ATAAAGAGGC 1200

AGCGCAGAAA TAATATTTTT ATTTCCTAGA TACATTTTAA GATTGTTAAA TAGAATCATT 1260

AGTGAATCTT ATTTTAAAGA TAGTAAAGGA TTAATCTAAA TAAGTGCGGA TAATATAAAC 1320

ATAACAACAT AATTAAmAGA CATAAATGAC aATAAAAGGA GTATAGAAAT GACTCAAACT 1380

GTAAATGTAA TAGGTGCTGG TCTTGCCGGT TCAGAAGCGG CATATCAATT AGCTGAAAGA 144 0 

GGAATTAAAG TTAATCTAAT AGAGATGAGA CCTGTTAAAC AAACACCAGC GCACCATACT 1500 

30 GATAAATTTG CGGAACTTGT ATGTTCCAAT TCATTACGCG GAAATGCTTT AACTAATGGT 1560 

GTGGGTGTTT TAAAAGAAGA AATGAGAAGA TTGAATTCTA TAATTATTGA AGCGGCTGAT 1620 

AAGGCACGAG TTCCAGCTGG TGGTGCATTA GCAGTTGATA GACACGATTT TTCAGGTTAT 1680 

35 ATTACTGAAA CACTTAAAAA TCATGAAAAT ATCACAGTTA TTAATGAAGA AATTAATGCC 174 0 

ATTCCAGATG GATACACAAT TATCGCAACA GGACCACTTA CTACAGAAAC CCTTGCGCAA 1800 

GAAATAGTGG ACATTACTGG TAAAGATCAA CTTTATTTCT ATGATGCGGC TGCTCCAATT 1860 

ATTGAAAAAG AAT CTATTGA TATGGATAAA GTTTACTTAA AGTCCCGTTA TGATAAAGGT 1920 

GAAGCTGCAT ATTTAAACTG TCCTATGACT GAGGATGAAT TTAATCGCTT TTATGATGCA 1980 

45 GTATTAGAAG CTGAAGTTGC GCCTGTAAAT TCATTTGAAA AAGAAAAATA TTT CGAGGGT 2 04 0 

TGTATGCCTT TTGAAGTAAT GGCAGAACGC GGACGCAAGA CATTACTATT TGGACCAATG 2100 

AAACCAGTAG GATTAGAAGA TCCAAAGACT GGGAAACGTC CTTATGCGGT GGTTCAATTA 2160 

50 AGACAAGATG ACGCTGCTGG TACACTCTAC AATATTGTTG GCTTCCAAAC GCATTTAAAA 2 220 

TGGGGAGCTC AAAAAGAAGT CATTAAATTA ATTCCAGGCT TAGAAAATGT TGATATTGTT 2 280 

AGATATGGTG TGATG CATAG AAATACCTTC ATTAATTCAC CGGACGTATT AAACGAGAAA 2 34 0 
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T ATG TAG AAA GCGCAgcTAG CGGCTTAGTT GCAGGTATCA ATCTTGCGCA TAAAATATTA 24 6 0 

GGCAAGGGTG AGGTAGTATT TCCGAGAGAA A CAATG ATTG GAAGTATGGC TTACTATATT 2 52 0 

TCTCATGCTA AAAACAATAA GAATTTCCAA CCTATGAATG CTAACTTCGG GTTATTACCA 2 58 0 

TCTTTAGAAA CTAGAATTAA AGATAAAAAA GAACGCTATG AAGCACAAGC TAATAGAGCT 2 64 0 

TTGGATTACT TAGAAAATTT CAAAAAAACT TTATAAAATA GTTAGAAAGA CTAGATATGC 2700

TATTCATTCT TAAGTCATCA ACGAGTAAGT AATGACTTTC TAAATGGAAA ATACTTATCC 2 76 0 

TAGTCTTTTT AATTTTGGAA TTGTTACGTA TTTCTGACAA TTTAGAATTC GCATTCAAAA 2 820 

15 AATATCTAAA TAAATAACAC GCAATAAGTT GATTGATGTA ACATGTAAGA GAATGTTTTA 2 880 

AATAAACTTT ATTTAAAAGG CAATGAAATA ATAAATGGCA AGGCTATTAA TAAAGACTTT 2 94 0 

TAGTAATTAA TTTAAAAAAG AGGTATTCTA ATTAACAGGT TTTCCGATTA GTTACAATTA 3 000 

TTTAATTCTC AAAAGATTTA GAATTGATTA TCAAATTACT GTAAGCCCTT TGCTGTATAT 3 060 

GCTACAATTC TTATTGATGG AGGGTAAATG TATTGAATCA TATTCAAGAT GCGTTTTTAA 3120 

ATACATTGAA AGTTGAACGG AATTTTTCGG AACACACATT GAAATCATAT CAAGATGACT 3180 

TAATTCAGTT TAATCAATTT TTAGAACAAG AACATTTAGA GTTGAATACT TTTGAATACA 324 0 

GAGATGCTAG AAATTATTTG AGCTATTTAT ATTCAAATCA TTTGAAAAGA ACATCTGTTT 33 00 

30 CTCGTAAAAT CTCAACGTTA AGAACTTTCT ATGAATATTG GATGACGCTT GATGAGAACA 3360 

TTATTAATCC ATTTGTTCAA TTAGTACATC CGAAAAAAGA AAAATATCTT CCGCAATTCT 3420 

TTTACGAAGA AGAAATGGAA GCGTTATTCA AAACTGTAGA AGAGGACACT TCAAAAAATT 34 80 

TACGGGATCG AGTTATTCTT GAATTGTTGT ATGCTACAGG CATC CGTGTT TCGGAATTAG 3 54 0 

TAAATATTAA AAAACAAGAT ATAGATTTTT ACGCGAATGG TGTTACCGTA TTAGGAAAAG 3600 

GGAGCAAAGA GCGCTTTGTA CCGTTTGGTG CTTATTGTAG ACAAAGCATC GAAAATTATT 3660 

TAGAACATTT CAAACCAATT CAGTCATGCA ATCATGATTT TCTTATTGTA AATATGAAGG 3720 

GTGAAGCAAT CACTGAACGC GGTGTACGAT ATGTTTTAAA TGATATTGTT AAACGAACAG 3780 

45 CAGGCGTAAG TGaGATTCAT CCCCACAAGC TCAGACATAC ATTTGCAACG CATTTATTGA 3 84 0 

ATCAAGGTGC AGACCTAAGA ACAGTACAAT CGTTATTAGG TCATGTTAAT TTGTCAACAA 3 90 0 

CTGGTAAATA TACACACGTA TCTAACCAAC AATTAAGAAA AGTGTATCTA AATGCACATC 3 96 0 

50 CTCGAGCGAA AAAGGAGAAT GAAACATGAG TAATACAACA TTACATGCAA CAACAATTTA 4 020 

TGCTGTAAGA CATAATGGGA AAGCAGCTAT GGCTGGAGAT GGGCAAGTAA CGCTTGGTCA 4 080 

ACAAGTCATC AT 3 AAA C AAA CGGCAAGAAA AGTGCGACGT TTATATGAAG GTAAAGTGTT 4 14 0 
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ATTACAAGAG TTTAGTGGTA ACTTAGAAAG AGCTGCTGTT GAA7TGGCAC AAGAATGGCG 4 26 0 

AGGCGATAAA CAATTACGTC AATTAGAAGC TATGCTAATT GTAATGGATA AAGATGCTAT 4 3 20 

TTTAGTTGTC AGTGGAACTG GCGAAGTTAT TGCTCCAGAT GATGACCTTA TCGCTATTGG 4 3 80 

ATCAGGAGGC AACTACGCAT TAAGCGCAGG ACGTGCATTG AAA CG CCA TG CATCGCA7TT 4 44 0 

GTCTGCTGAA GAAATGGCAT ATGAGAGCTT GAAAGTAGCG GCTGATATTT GTGTCTTTAC 4 500 

CAACGATAAT ATTGTTGTCG AAACACTATA ATAATCAGAG CACGATAAAT AATTACGAGC 4 56 0 

AATTAATTTT AGTTAAAAGA CGGAGGAATG AAATTAATGG ATACAGCTGG AATAAGATTA 4 62 0 

15 ACTCCAAAAG AAATCGTATC TAAATTAAAT GAATACATCG TTGGACAAAA TG ATG CT AAA 4 68 0 

CGTAAAGTGG CAATTGCCCT ACGTAATCGA TACAGAAGAA GTTTATTAGA TGAGGAATCA 4740 

AAGCAAGAAA TTTCACCTAA AAATATTTTG ATG ATTGG A C CAACCGGCGT TGGTAAAACT 4 80 0 

GAAATTGCAA GAAGAATGGC CAAAGTTGTC GGCGCGCCAT TTATAAAAGT AGAAGCTACT 4 86 0 

AAATTTACTG AGGTAGGTTA TGTAGGACGA GATGTTGAAA GTATGGTTAG AGATCTTGTT 4 92 0 

GATGTTTCAG TAAGATTAGT CAAGGCGCAG AAAAAATCAT TGGTACAAGA TGAAGCAACA 4 98 0 

GCTAAGGCCA ATGAAAAACT TGTTAAGTTA TTAGTTCCAA GTATGAAAAA GAAAGCGTCT 5 04 0 

CAAACGAATA ATCCTTTAGA GTCACTTTTC GGAGGTGCAA TTCCAAATTT CGGACAAAAT 5100 

AACGAAGATG AAGAAGAACC ACCTACTGAG GAAATTAAAA CAAAACGTTC TGAAATTAAG 5160

AGACAGCTAG AAGAAGGCAA ACTTGAAAAA GAAAAGGTAA GAATTAAAGT CGAACAAGAT 52 2 0 

CCTGGTGCTT TAGGTATGCT AGGTACAAAT CAAAATCAGC AAATGCAAGA GATGATGAAT 52 8 0 

55 CAATTAATGC CTAAAAAGAA AGTTGAGCGA GAAGTTGCTG TTGAGACGGC AAGGAAAATC 53 4 0 

TTAGCTGATA GTTATGCGGA TGAACTAATT GATCAAGAAA GCGCTAACCA AGAAGCGCTT 54 0 0 

GAATTAGCAG AACAAATGGG TATCATCTTT ATAGATGAAA TCGACAAAGT TGCGACGAAT 54 60 

AATCATAATA GTGGTCAAGA TGTCTCAAGA CAAGGTGTTC AAAGAGATAT TTTACCTATA 552 0 

CTTGAAGGTA GCGTTATTCA AACCAAATAT GGTACTGTGA ATACTGAACA TATGCTGTTT 558 0 

ATAGGTGCTG GAGCTTTCCA TGTATCTAAG CCGAGTGACT TGATACCAGA ATTGCAAGGT 5640

CGTTTTCCGA TTAGAGTTGA ACTTGATAGT TTATCGGTAG AAGATTTTGT AAGAATTTTG 5700 

ACAGAACCAA AATTGT CATT AATTAAACAA TATGAAGCAT TGCTTCAAAC AGAAGAAGTT 5760 

50 ACTGTAAACT TTACCGATGA AGCAATTACT CGCTTAGCTG AGATTGCTTA TCAAGTAAAT 5 82 0 

CAAGATACAG ACAACATTGG TGCACGTCGA CTTCATACAA TTTTAGAAAA GATGCTAGAA 5 8 80 

GATTTATCAT TCGAAGCACC AAGTATGCCG AATG CAGTTG TAGATATTAC CCCACAATAT 5 94 0 
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AAATATACAA AAGGAGAAAA ATTCATGAGC TTATTATCTA AAA C GAG AG A GTTAAACACG 6 06 0 

TTACTTCAAA AACACAAAGG TATTGCGGTT GATTTTAAAG ATGTAGCACA AACGATTAGT 6120 

AGCGTAACTG TAACAAATGT ATT7ATTGTA TCGCGTCGAG GTAAAATTTT AGGATCGAGT 618 0 

CTAAATGAAT TATTAAAAAG TCAAAGAATT ATTCAAATGT TGGAAGAAAG ACATATTCCA 624 0 

AGTGAATATA CAGAACGATT AATGGAAGTT AAACAAACAG AATCAAATAT TG AT AT CG AC 63 00 

AATGTATTAA CAGTATTCCC ACCTGAAAAC AGAGAATTAT T CAT AG ATAG TCGTACAACT 6 36 0 

ATCTTCCCAA TTTTAGGTGG AGGGGAAAGA TTAGGTACAT TAGTACTTGG TCnAGTACAT 642 0 

GATGATTTTA ATGaAAATGA TTTGGTACTA GGTGAATATG CTGCTACAGT TATTGGTATG 6480

GAAaTCTTAC GTGAGAAGCA TAGTGAAGTA GAAAnAGAAG CGCGCGATAA AGCTGCTATT 6 54 0 

ACAATGGCAA TTAATTCATT ATCTTATTCT GAAAAAGAAG CGATTGAACA TATCTTTGAA 6600 

20 GAACTTGGCG GTACGGAAGG CCTATTAATC GCATCAAAAG TTGCAGATAG AGTTGGTATT 6660 

ACTAGATCTG TAATTGTAAA TGCACTACGT AAATTAGAAA GTGCTGGTGT AATTGAATCA 6 720 

CGTT CTTTAG GAATGAAAGG TACTTTCATT AAAGTTAAAA AAGAAAAATT CTTAGATGAA 6 7 80 

TTAGAAAAAA GTAAAT 6796
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2073 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATCCTAAAAT TnAAAATTAT CACGCCTTTT GaACAGCTTT GTAACCaTCt GGACGATCAT 60

kAAATTCCaA TGTAAATCCT GGTTTAAaGT TGATCTTTAA CCTTATTTAA AyCACCAATT 120

GTACGTATAT TATGTTGTTT AGCAAAATCA CGTTTTACAG CTAAAGCATA CGTATTGTTA 180

TACTTCATTG GTTTTAACAT AGTCATTTGA TATTTCTTTT CAAGACTTTG CTTAGCTTGT 240

TCATAAACTT TTTTCTCTTC TTTTGACTTC AATGGTTCTT TTGTTAATTC ACCTAAAACT 300

GTTCCAGTAA ATTCTAAATA CCCATCTATA TCGTCAGATT TTAAAGCATT AAATAAAAAT 360

GCTGTTTTGC CCATACCATC TTTCACTTCT ACAGTATTTT TGGTCTCTTC TTCTATTAAA 420

ATTTTATACA TATTTGTAAT AATCGATGGC TCGGAGCCAA GCTTTCCAGC TAACGTAATT 480

TTATCACCTT TTTGTGCAAA CATAGGAATA GCGATAGCCA GTATAATAAT CATCACTATA 540
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TCAAA7A7AA TTGCCAATAA GGCTGCTGGA ATTGCACCTA ATAATATCAA CGATGCATTG 6 60 

TTACGGTCTA TAC CTAATAA AATTAAATCT CCTAGTCCGC CTGCACCAAT TAATGCTGCT 72 0 

5 AGT3TTGCTG TACCTATAAT TAATACCATA GCCGTTCTTA CACCAGCCAT TATAACAGGC 780 

ATTGCTATCG GAAGTTCGAC TTTAGTTAAA CGTCTAAATG GTTTCATACC TATACCTTTA 84 0 

GCCGCTTCAA TGAGTGATGG ATCAACTTCT 7TAATTCCAG TATACGTATT CCTTAAAATT 900 

GGTAACAACG CATACACTAC AAGTGCAATA ATTGCTGGCA CACGACCGAT ACCAAATAAA 96 0 

GGAATCATTA AAC CTAATAA TGCCAACGAT GGTATGGTTT GAAGAATTGC CGCAATATTC 102 0 

15 ATTACGATTT CAG AT AT CG T TTTAGTCTTC GTTAATAAAA TACCTAATGG TACCGCAATA 108 0 

GCAGTTGCAA TCAATAATGC GATAAATGAT ATTTGAATAT GTTCTATCAT TGTCGAAAAG 1140 

AGTTGCCCCT TACGTTCACT CAATATGTCg AAAAAGTTAG TCATGTTGAG CTACCTCCTT 1200 

20 TTTCTGGGAC AAATATTTGA AGATATCTTT CCTATCAATA ACATATTGAC CTACGCTATC 126 0 

TTCTTGCATG ACAATGACAC GCTCGCTCTC TGATAAAAGT TGATACAATA CTTCAATTGG 13 20 

TTGATTGTCA TAAACAATTG GATAAGCGCT CATAGATGTA ACCTCATCGA TTGGTTTCAT 13 80 

AATATCCAAG TCACGGATAA TTGCGTTCTC TTCAACACAT GGCGCATCAT CTTCTAAATG 144 0 

ACTACCCATA AATTGTTTAA CAAATTCACT TTGAGGATTA TTTTTAAATC CTTCTGGTGT 1500 

GTCAATTTGT TCAATATGCC CTTCATTCAA AAGACAAATC TTATCACCAA GTTTCATCGC 156 0 

CTCTTGAATA TCATGTGTAA CAAATATGAT TGTCTTCTTA ATTTTAGTTT GTAATTCAAT 1620 

TAAA7CATCT TGAAGTTTTT CTCGGCTGAT TGGGT CTAAT GCACTAAACG G XT CATC CAT 1680 

35 TAAAATAACT GGTGGATCAG CTGCTAACGC ACGTATAACT CCTACACGTT GTCGTTGCCC 174 0 

CCCTGACAAT TCATCAGGTT TTCTGTTTTT ATATTTTTCA GGTT CTAAT C CAAC CATTTC 1800 

AAGTAATTCA TCTACTCTTT TATCTATATC TTTTTCTTTC CACTTTTTCA TTTGTGGCAC 186 0 

TTGTGCAAt A TTTTCTTTGa wTGTCaTATG TGGGAATAAT GCAATCTGCT GcAATACGTA 1920 

TCCAATATCC CAACkCATTT CGTATACTGG ATAATCACTT ATTGGTTTAT CTTTAAAATA 1980 

AATATAACCT TCACTTAAGT GAATGAGTCG ATTAATCATT TTTAATGTCG TAGTTTTTCC 204 0 

ACAACCTGAA GGTCCAATTA GCACAAAAAA TTC 2073
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13321 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
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Cxi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

ACTATTCTAG CTTCATCAGT TATCATATAT TCTTTGAAAC ACTTGTAAGA AAATATAATG 60

AGTATTTACT ACATAATGAT ATTTCAAATT AGAAAAAAGG AAGTTATGAT TTAATGGCCT 120

TGAGCCTATC ATAACTTCCT TTTATCATTT TATTGTTGTG TTGATGTTTC GATAACGTGG 180

TACATCTTAT CAAACATCAA TTCGAAACCA TGCACCATGG CATCATGATA TTCTTTTTTC 240

TTTTGCTTGT ATTCTAAATT AGTAAATCGT CTTTCTTTTT CAACTAATGA ACGATAATAA 3 00 

AATAGCATTT GGGTGCCACC TGTTTCACGT TCAAAAAATT CTACCTCAAT GACATCTTGC 36 0 

GTTTCACTTA GTCCAGGCAT ACCGATAGTC ATCTTAACGT ATTCATCCAT AACTAAAGAT 420

TCATAAATGC CTTCAATCAC ATTTACTTTG CCATTACGTT GTTGATCTAC AATACGATAT 4 80 

TTACCGCCTT CTTTAACGTC CGCTTCAATC TCTTTATTCG TTCTGGCTGA TGTCATAAAC 54 0 

20 CATTGTTTCA ACAAATCTTT CTTTGTCCAA GCTTCGTATA CTAACTCTGG AGAAAATTTA 600 

TAAAGCTTTT CAATTTCAAC TTCGACATGT TCATTCTCTA CATTAAATTT TGCCACTGTT 6 60 

GTCCACCCAC TTTCGCTCTT ACTTTTATTT TAACGTATTT TTGCTCAGTT CCAAACATAG 72 0 

ATGATCATCA TTTTTAAAAG ATTAGCGTTA TACGGTGAGT ACAACATGAT CTGTTAATAT 7 80 

AACAAGCCAC CTTACTTGGC TACATCGATA TATTGTTAAG CATTAATGTT TCATTTCTTG 84 0 

ACTAGTGTTC TTTTTT AG CT TTGGAAAATT AAATAAAATC GCAATAAGTC CGCATACACC 900 

TAATAATATA GGATAAATGC TGTATGGGAA TAACATTAAC GGTGAAATAC CAGCTACACC 9 60 

AGCCGCTGaA ATGACTTGCG GGCTATATGG TAATAAACCT TGGAAGCAGC CTCCAAATAT 10 20 

ATCAAGAATA CTTGCTGATT TCCTTGAATC TACATCATAT TCATCTGCAA TATTTTTAGC 1080

TAAAGGACCT GACATAATAA TAGAGATGGT GTTGTTTGCC GTGGCAATAT CTGCGACACT 1140 

TACC&AACTA GCAATTCCTA ATTCTGCGCC ACGCTTTGAT TTCACTTTAG AGCGAACAAA 12 00 

TTGCAACAAC CATTCAATAC CACCATTGTG TTGAATAATA CCGACTAAAC CACCAATTAG 1260 

CAACGCAATC ATAGCAATAT CTTCCATGCT TATAATACCT TTGGACACTG CATCTAGTAG 1320 

CCCCATCCAA CCGAATGAAC CATCTATGAG ACCAATGATT CCGGCTAATA ATGTTCCGCC 13 80 

AATCAATACG ATAATGACAT TTACACCTAA TAATGCTAAT ACCAATACTA AGATATACGG 144 0 

TACAACTTTA ATT AG ATT AT AATCATAGTt TTTAGCATGA TTTAAAGAAA TGCCATTCGT 150 0 

so TAAGAAATAC AGAATAATAA TCGTTAAAAT AGCACCTGGC AATACAATTT TAAAGTTTAC 156 0 

TCTGAATTTA TCTTTCATTT TCGTATGTTG TGTTCTAACC GCAGCAATTG TTGTATCTGA 162 0 

AATCATTGAT AGATTATCGC CGAACATTGC ACCTCCAACA ACTGT AG CC a tTGcCAGCGC 1680 
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TCCTACAGAC GTCCCCATAG ATATAGAAAC AAACATACAA AT CA C AAA C A ATCCTACAAT 1B0 0 

AATTAAATTT TCTGGGATTA ATGATAGTCC TAAATTAACT GTCGACTTTA CGCCACCCAT 1860

TTTTTCAGCT GTATTTGAAA ATGCACCTGC TAAAATAAAA ATCAACATCA TTAAAACAAT 1920

GTTTGAATGG CCTGCACCTT TCGTGAAGAC CTCAACTTTT TTAGCAAATG ATTCTTTTCG 198 0 

AT7CATTAAT AACGCCACAA TTACCGTTAT CGTAATTGCA ACATTTAATG GCATTGAAGT 2 04 0 

AAAATCACCT GTGATAATAC CTACGCCTAA AAACAACGCC ACAAATAATA ACAAGGGGAA 2100 

TAATGCCCAA GCATTGCTCT TTTTATGTAC TTCCATCCTT TTTACCTGCT TTCCAATTAA 216 0 

AAATACCTCT TTCTCACAAA CGATGAAGAA AGAGGTTTTC ATGTGCTTTA CCTGCTTATC 2220

TTCAAACCAT TACGGTTACT GGAATTGGCA CATTCGAGAT GTTGCCGAGG CTTCATAGGG 22 8 0 

CCAGTCCCTC CACCTCTCTA GATAAGTGAT GCTTATTTAC GTTTACGTTA CAAGATAATC 2 34 0 

CTTAGTACGT CAATCATAAA TTAATCAGGA GTCGTATAAT ATTTTTCATA AACAATCATT 24 0 0 

GCTACTGTAA TAATAATCAA AACAATAATG CTAATAACAA GTAAAAGCCA CCATTTAAGC 24 6 0 

ATTAATGCAA TAAAAATGAA CACGATAGAC ACACTTACTA ATATTAATGA TATGACTTTA 252 0 

AATTGCTGAA CACGTTGCTT GGAGATGACT TTCAACTGTT TGTTTGATAG ACGCGTATTT 2 58 0 

TTTATACTGA TTCCCAGTAT ATTTTCTAAT ATTTGAACCA ATA CG ATA CT TATTG CAAAT 264 0 

30 ATAATAATTG GTAAAACATC ATAGCTCCCT ATAGTTAATG TATAAATTAC AAATCCAATG 27 0 0 

TAAAGTAACC CTGAGACAAA GGATAAAAAG TATGCGACGT ATTTGTTAAA CTTAATGATA 276 0 

TGCTTTTTAA CGTTTTGATG TGTAAACCAT ACATTCGAAA CGATCGCAAC TGCTACAAAT 2 82 0 

35 AATGTGAATA CTATATATAA TGGTAATTTT TGTTCAGGAA AAACAGTCGC TATTCCAAAA 2 88 0 

GCTAATGCTA AAATCAAAAA TAATATAGCT CTAGATACTA TTAATG C CAT AATAACAACC 2 94 0 

CCTTTGTTTA ATATCGAGTT TGCAAATTTA CGTTTAT GAG CGTTT CTATG ATCAGTACTT 300 0 

CTACGGGTAG CGTTT CTATG TAATTTACAT CATCTTAACA TATAAATACT TCGCTATTTA 30 6 0 

ATTGAAAACA TATCCTATTA TTCTTTGTCC GTTCTGACGT TTAATATCTA GCCTTAGGCA 3120 

45 TTTCACTTGT TAATGAATTT AACTTTCTTC CACTAACCGT CCCTAAACCC AATCCCGCAA 318 0 

CAGTTTTTAA CTTTTTCGTT GTTGTCCTGA CATCCTCATT AAGAAAGTTT ATTCTGCTTA 3 24 0 

AAACTTATAA TCCACACCCT GAGCAAACGC TCCTTATGAC AGAGTATTAA AATAAGCCGA 33 0 0 

50 TAAAGATACA CACCTTTACC GACTATTTAA AATACACTTC ACCAATTCAT TTTAATTTAA 336 0 

TGGATTGAAG TAACTAAATT AATATTATGT TGTTCAATTA AAAGCTTCAT ACAAACCTAA 34 2 0 

TCTATTTGCA CTCCACCGCT AACACCGAAC ACTTGTCCGG TTGTATAACT TGATTCTTCT 34 8 0 
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GTTTTTTGAC CAAATGTTGG GATTTTACTT TGAGGTTGTC CACCAGAAAT TTGTAATGGT 3 6 00 

GACCAGAATG GACCAGGCGC TACACAGTTC ACTCTAATTC CTTTTGGTCC TAATTCTTCT 3 66 0 

5 GAAAAACTTT TAGTTAATGA AATAATTGCT GCTTTTGAAG CGGCATAATC ATGAAGAATA 3 72 0 

GGACTAGGAT TATAACCTTG TACAGATGAT GTCGTTGTAA TTGACGCACC CGGTTTTAAA 3 78 0 

TATTCCAATG CTTTTTGAAC TGTCCAAAAT AGCGGATAGA CATTCGTTTC AAATGTTTCT 3 84 0 

GTAAATGCCT CAGTTGTAAA TCCATGAATA TCATCATGAT ACTGTTGATG TCCAGCAACT 3 90 0 

AAAGTAACAT TATCTAAGCC ACCTAATTGT TGATATGCTT GTTCAACAAG GTCATAGTTG 3 96 0 

AACTGTTCAT CTCTTATATC ACCAGGAATT AACACTGCCT TTTGACCACT TTCTTCAATC 4020

ACTTGGCGTA CTTCTTGTGC ATCTTGTTCT TCACTCGGAA GATAGTTAAT CGCTACATCT 4 08 0 

GCACCTTCTT TAGCATACGC AATTG CTGCT GCACGCCCTA TTGCTGAGTC ACCACCTGTG 414 0 

20 ACTAATATTT TATAGCCTTG TAAGCGTTGA TGACCTTGGT AAGACGTTTC GCCACAATCG 4 200 

GGTGCTGGCG TCATTTCAGA TTGTAAACCC GGTACCTCTT GTTCTTGTTT TTCATAATCC 4 26 0 

GTTGTTTTAA ATTTTGTTCT AGGATCTTGA GCTGCCATTT TTTTACATCT CCTTATTCGC 4 3 20 

TTAATGGTTA TTATTTACCC AATCTTCCTA GGAACTTAAT CATGATTACA CTAAAAATTA 4 3 80 

CTTTCTTCTT TATAAAAACA AGCTCGAATT ATTCATGCAA TAGTCTCTTT ACAAATTCAA 4 44 0 

CAAAATACTC AGGT AC TTTT TCCAGAATCC TTTCATCCGG TTTATATTGA GGATGATGTA 4 5 00 

AATCATATTC ACTATGAGAA CCAATTAACG CAAATACACT TGGAAAATGT TGACTATAAC 4 560 

CTGAAAAATC TTCTCCAATC GTAAGCGGCT GTTCCATCAT TCCCACCTTA TATCCAACAT 4 62 0 

35 GTTGGGCTAC TGCAATTGCT TTATGCGTCA ATGCCTCATC ATTCATCACA GCGCCAGGTA 46 8 0 

AATGCGTATA ATTTAAATTA ATTTTCATAT TATATGCTTG AGCCAATCCG TCCGCAATAT 4 74 0 

CTTGTAATCG TGTTTCTACA AGCTTTCGTA CCACAGGATC AAAACTACGC ACTGTGCCTT 4 8 00 

GTACATACGC ATGATCAGCA ATGACATTCC AAGTATTACC ACATGATATT TGTCCAATTG 4 8 60 

TTACTACCGC TT CATC AAA C GCAGATAGAT TTCTACTAAC TATGGATTGA ATACTATTAA 4 92 0 

TCAATTGCGC CAACACAATA ACTGGATCGT TGCATTGTTC TGGcTTTGCA GCATGACCAC 4 98 0 

CCACGCCTTT AATATGAAAC TCAAAACGAT CTACTGCTGA TGTAATTGCC CCTGTTTTGA 504 0 

TTGCAAATGT ACCTACCGAA CGCGATGGGT CATTATGAAA ACCCAATACT GCTTGTACAT 5100 

CTTTTAATGC ATGTGTTTCA ATAATTTTAA AAGCGCCATG TCCTAGTTCT TCTGCTGATT 5160

GAAAAATGAA TTTAACACGC CCAGTAAGAG TGCCCTCAAT TTCTTTTAAT TTTACAGCTG 52 20 

TAG CC AAAAT ACTAGCCATG TGAATATCAT GACCACACGC ATGCATAACA CCTTCATTTT 528 0 
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CAGCTATACA ACTCAGACCT TGTCCCACTT CAGCAACAAG CCCAGTCGCA AGTGGTAAGT 54 00 

CTAATATTCT AATATGATGT TCTGTTAAAA TATCTTTAAT TTTTTGTGTA GTCTTAAATT 54 6 0 

CTTTATCGGA TAGTTCTGGA AATTGATGAA AATACCTTCT CCA3GTAACA GC1TGATCTT 5 52 0 

TTAATCCCAT CGGTCATTCC CCTTCCTTAA GTCAATGATA TGTTGTCTAC CCTACGATGA 5580

TCATCTTTGA C7ATTAAACG ATGATTTCAC AACAATGTAC TCTTGTTAAT TGCTTTCGTT 564 0 

AATGATAGAC AGTTGTTTAA TAATATCGTA ACACTGTTGT CAAACTATTC TAACTTTTAT 5 700 

AATTGAGACT CTATACAAAA ACGTGTTCTC GAATATACTT GTTTTTACAA ACCACAAAAA 57 6 0 

15 GCTCTAAACA TTAGTTTAAA CCAATGCTTA GAG CTTTCTA ATTATTTTAT GCTTTAAAAG 5820 

ATACTGTGTT ATCTACGATG ACCTTACCGT CTTTAATAAC TTTTTCTGCG TGATTGATAC 5880 

CAAAATGATA TGGAATATAT TCATGATTTG GTGCATCCCA AATTACTAAA TTAGCCTTAT 594 0 

CACCTGTGTT AATTGTACCC GCGTTAATGT CTATTGCTTT AGCAGCATTG ACCGTAACAG 6 00 0 

CATTCCAAAC TTCATTAGGT GATAGCTTTA ATTTCAAGGC TGCAATCGCC ATAACAAGTT 6060

GTAAG TTGTT TGTGACACTA CTACCAGGGT TATAATCAGT TGCTAATGCA ATCGCACCGT 6120 

TATTGTCAAG CATGCCTCTT GCATCTGCAT AAT CTTCTTT ACCTAAATAG AACGTCGTTG 6180 

CAGGTAAGAG GACAGCTACA GT AT CACT AT TTCGCAACTT TTCTTTTCCT TTAT CACTAG 6 24 0 

30 AAGCTACTAA GTGGTCTGCT GATATTGCTT GTTCATCAAT TGCTAATTCC AGTCCGCCTA 63 0 0 

ACGGATCAAT TTCATCCGCA TGTATTTTCA CTTTAAAACC TGCTTCTTTG G CTTTTTG CA 63 6 0 

TATAATGTTG CGATTGTTCT ATTGTAAATA CACCTGTTTC ACAGAAAATA TCCGCAAAGT 64 2 0 

35 CTGCATATTG TTTTACTTCC GGAAGTAACG CAATCATTTC TTCTAAAAAT GCCTCATTTG 64 80 

AACTTGCCTC TTTAGGTACA GCATGAGGCC CTAGGAAAGT ATGTTTCATG TCT AAAT CAT 6 54 0 

ATTTCTCAGC TAAACGATTA GACACTTTCA ATTGCTTCAG TTCATTTTCT CTATCTAATC 6 6 00 

CAT AAC CACT CTTACTTTCA ACTGCAAGCA CGCCGTGTTT AAT CAT AGT A AGCAAATCAT 6 660 

GCTCTGCTTT TTT AAA C AAG TCATCTTCGG ATGTTTCTCT AGTAGCATTA ACGGTAGATA 6 72 0 

45 ATATGCCACC ACCCATTTCT AATATTTCAA GGTAAGACTT ACCTTGACGT TTTAATGACA 6 780 

TCTCATGTTC TCGAGATCCA CCAAATGTTA AATGGGTATG TGCATCTACT AATGCTGGGG 6 84 0 

ACACTACCTT CCCACTAGCA TCAATCGTCT CAGTCGCATC GTAGTCATCT GTATGTGTTC 6 900 

50 CAGCATATAC AATTTTGCCA TCTTTAATGA CAACTGTACC ATTTTTCACA ACATTTAATT 6 96 0 

CATCTAATTC CTTACCCTTC AAAGGTTTAT CTGTTGATCT CGGTAAAATT AATTCTGCTA 7 020 

TATGATTAAT T ATT AAAT CA TT C ATT AC TT ATCACCTGCT TTATCAATCA TTGGAATATG 7 080 
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20 



AACACCCATA CCTGGGTCAG TCGTCAATAC ACGTTCCAAT CTTCTTTCAG CACGCTCTGA 72 0 0 

TCCATCTGCT ACAACAACCA TACCCGCATG AAGTGAATAT CCCATGCCAA CACCGCCACC 72 6 0 

GTGATGGAAT GAAATCCATG AACCACCTGC AGCTGTGTTA ATGAGTGCAT TCAATACAGC 73 20 

CCAATCACCA ACCGCGTCAC TACCATCTTT CATACTTTCT GTTTCACGGT TAGGACTAGC 73 80 

AACTGAACCA GCATCTAAAT GGTCTCGTCC AATAACAATT GGTGCTGAAA TTTCACCGTC 74 4 0 

ACGTACAAGA CGATTTAAAG CTAAGCCCAT TTTCGCTCTT TCTCCATAGC CTAACCAAGC 7500 

AATACGTGAT GGTAGTCCTT GATATGAAAT TTTTTCTTCA GCTAAATCAA GCCATCTTAA 7 56 0 

TAACTTTTCA TTTTCTGGGA AAAGTTTGCG CATTTCTTCA TCCGCACGCT CGATATCTTT 7 620 

TGGATCACCA CTCAACGCAG CAAAGCGGAA TGGCCCTTTA CCTTCACAGA ATAATGGTCT 76 8 0 

AATGTAAGCT GGTACAAAGC CTGGGAAGTC AAAAGCATTT TTCACTCCGT TATTGAAGGC 7 74 0 

TACTTGACGA ATATTGTTAC CATAATCAAA TGCTACAGCG CCACGTTTTT GGAATTCAAG 7 800 

CATTAATTCA ACATGCTTTG CCATTGAAGC TTGTGACAGT TCAACATATT TTTTCGGATC 7 86 0 

TTTTTCACGC AATACTTTCG CTTCTTCTAC AGAGTATCCT TGTGGCACAT ATCCATTTAG 7 92 0 

CGGATCATGT GCACTTGTTT GGTCAGTAAT AATGTCAATT TTAAATCCTT TTTCTAGAAT 7 980 

CGCTTGATGG ATGTCTACAG CATTTCCAAC TAACCCGATT GATAATCCTT CTCCACGTTC 8040 

TTTCGCCTCT TCTGCTAATT TTAATGCTTC ATCTAAATCA GCTGTTTTAA CATCACAGTA 810 0 

TTTCGTATCA ATTCGCTTAT CAACACGTGT TT CAT CAAC A TCCACGCAAA TTGCTACCCC 8160 

ATGATTCATA GTAATTGCTA ACGGTTGCGC ACCACCCATA CCACCTAAAC CTGCTGTCAG 82 2 0 

TGTAACAGTG CCTGCTAAAT CTCCATTAAA GTGTTGATTA CCTAGCTCGG CAAATGTCTC 8280 

ATAAGTACCT TGCACAATAC CTTGAGAACC AATATATATC CAACTACCGG CTGTCATCTG 83 4 0 

TCCATACATG ATTAAACCTT TTTTATCTAA TTCATTAAAA TGATCCCAGT TTGCCCATTC 84 0 0 

AGGCACTAAT ACTGAATTTG AAATTAATAC ACGTGGCGCT TCTTCATGTG TTTTAAATAC 84 6 0 

AGCAACTGGC TTTCCTGATT G TACT AA CAT TGTCTCATCT GATTCTAATT CTCGTAACGT 8 52 0 

TTTCTCTATT GCTTCAAAAG CTTCCCAATT ACGTGCTGCT TTTCCAATAC CACCATAAAC 8 5 80 

AACTAAATCT TCTGGTCTTT CAGCAACTTC TGGGTCTAAA TTGTTGTATA ACATTCTAAG 8 64 0 

TACTGCTTCT TGTTCCCAAC CTTTACACTC AATACTCAAA CCTTTTTTTG CTTGAATTTT 8 70 0 

TCTCATAAAA TTCGCTCCTG TTCTTTTAAG AAGTTAATTC CACTAAATTT AAAACGCTTA 8 76 0 

CATTATTATC TTCAATATTC ATTATAGTAT GTTAAAATAT AG C CAAC AAA TATAAATAAA 8820 

CTAATTATCC ATAGCTTGAA TCTATAAATA AAAGGAGCAA AACACATGAA AATTATTCAG 8880 
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CAT ATT AG CC AGCCATCTTT AACTGCTACG ATTAAAAAAA TGGAAGCAGA TTTAGGTTAT 9 3 00 

GACTTATTTA CACGTTCAAC AAAAGACATC AAGATTACCG AAAAAGGAAT ACAGTTTTAT 9 0 60 

CGTTATGCGA GCGAATTAGT TCAACAATAT CGATCCACGA TGGAAAAAAT GTATGATTTA 9120 

AGCGTTACAT CAGAACCAAG GATAAAAATT GGGACTCTTG AATCTACGAA TCAATGGATT 9180 

w GCGAATTTAA TTCGAAAGCA CCATTCCGAC TACCCTGAAC AGCAATATCG TTTATATGAA 924 0 

ATACATGATA AACATCAATC TAT AG AG CAA TTACTGAATT TTAATATTCA TTTAGCTATA 93 0 0 

ACAAATGAAA AAATAACCCA CGAAGATATA AGATCCATTC CTTTATATGA GGAATCTTAC 936 0 

ATTTTATTAG CACCCAAGGA AACATTTAAA AATCAAAATT GGGTAGATGT TGAAAATTTG 9420

CCACTCATAT TACCAAACAA AAATTCTCAA GTGCGCAAAC ACTTAGATGA CTATTTTAAT 94 80 

AGAAGAAATA TTCGTCCAAA TGTCGTTGTA GAAACAGATC GATTCGAATC AGCAGTTGGA 954 0 

TTTGTTCATC TCGGCTTAGG TTACGCTATC ATTCCGAGAT TTTATTACCA AT CATTT CAC 96 00 

ACGTCTAATT TAGAATATAA AAAAATTCGT CCAAACTTAG GCCGAAAAAT TTATATCAAT 96 60 

TACCATAAAA AACGCAAACA CTCCGAACAA GTACATACAT TCGTACAACA ATGCCAAGAT 972 0 

TATTTATATG GACTTTTAGA GGCTCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 978 0 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 984 0 

30 CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 99 00 

CTCAGTCAAC TGTATACCTT TTTCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9 96 0 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGTGCCTCT TATGTAGTTG 1002 0 

CGTAGTCAaC TGTaTACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 100 80 

CGCAGATCAT CGTATAAAAA TTAATGACGT CATTTCAAAA AT CG ATA CAA AAATAATTTA 1014 0 

TTATAAAAAT TCTAAGAAAG AAGTGAAGCA GATGTTAAAA TCTATTAATC ATATATG CTT 102 00 

TTCAGTCAGA AATTTAAACG ATTCAATACA TTTTTATAGA GATATTTTAC TTGGGAAATT 1026 0 

GCTATTGACT GGTAAAAAAA CTGCTTATTT TGAGCTTGCA GGCCTATGGA TTGCTTTAAA 103 2 0 

45 TGAAGAAAAA GATATACCAC GTAATGAAAT TCACTTTTCA TATACACATA TAGCTTTCAC 103 80 

TATAGATGAC AGCGAATTTA AATATTGGCA TCAGAGGTTA AAAGATAATA ACGTGAATAT 10440 

TTTAGAAGGA AG AG TT AG AG ATATTAGAGA TAGACAATCA ATTTACTTTA CCGACCCTGA 105 0 0 

TGGTCATAAG CTAGAATTAC ATACTGGCAC ACTTGAGAAC AGATTAAATT ATTATAAAGA 1056 0 

GGCTAAACCA CATATGACAT TTTACAAATA AGGTGTCATT AT AAAAAGG C CTCTTGAACT 10620 

CCGTTAAAAT TTTAATTAAT TATTATATAA TAAGAGAACT TTTCAAACAA TACAGTTGTT 10680 
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TTACTGCAAT TATTTTTCAA ATATATCAAC GTTAATATAA CTTCTATTAA GAAATACTCA 10 8 00 

CATTCTGCCC TGCAATGCAA ATCTCGTCAC ATATAAATA7 TTTTAATTAT TTTAAAAAAT 10860 

GATGCACTAA ATTAGCAACG AGCTTAGCAG TTCTATTGTC AGCGTCATAT GTTGGATTCA 10 92 0 

TCTCAGCAAT ACTAACTGAA GACACCTTAT CACTTGGAA7 AATACGTTTT GCTAATTCAA 109 80 

GAACAGTATG TGGATACAAA CCTAACACTG CCGGCGCACT TACCCCAGGC GCAAACGCAC 11040 

TATCAATGAC ATCCATACAA ATCGTAAACA TAATGACATC ATGTTCATGT ACAAAACGTT 11100 

CAATCATATC TTTAATTGTT GGTGATACGT GACTCAATAA TTCATCTGCA AAGACATAAT 1116 0 

CAATCTTTTT CTCTTTAGCA TAATCAAATA AACTTTGCGT ATTACCACCT TGAGCAATAC 11220

CAAGCACTAA ATAATCTGTG TTTTCATCTT CTTCTAAAAT TTGTCTAAAG CTCGTTCCAG 112 8 0 

ATGTAGATTG TTGTTCAGCA CGTGTATCAA AATGCGCATC AATATTTATC ACACCAATAG 1134 0 

20 ATTGTGTTGG ATAGACTTTA CGTGTTGCTA AATATTGAGC ATACGCAATA TCATGTCCAC 114 0 0 

CACCTAATAA AAATGTTTGT CT AT G ATT AG CAATTGACTT CGCTGCAAGC AT AG CAAATT 114 60 

CTTTTTGAGT ATCAATTAAT TCCTCATGAT CATGATAAAC ATTTC CGTAA TCGACTAAAG 11520 

TTcACATTGA TTCAAATCCG GCAAACCTGC AAATGCTTGT TTAATCGCAT CTGGTCCTTC 11580 

TTTTGCACCA ATGCGCCCCT TGTTTAAAGC AACACCTTTG TCAACAGCAT AGCCTAATAT 11640 

30 ACCGACCCCT GATGGCATAC TACTCTTTTC CAG CTTAG AC AAATCTTCAA ATGTTACTGT 11700 

TTGAAAATGT CTAAATTTTT TCGGGTCTGT TTCACTATCT AACCTTCCAG TCCATAAATT 11760 

TGGTTCACCT TGCTTGTACA CAGCATTTCC CCCTCTTATT TATGTGGCTT ATTAACAATT 11820 

35 AAAGTATAAC GTATAGGAAA TTTTGAATTC AATTCATAGT TAAATCCGTA TCTTAAAAAT 118 8 0 

ACTTATCTAC ATTA C TTTTA CCCCTATTTT CTATGTAATA ACGAATACTT AGCTGATTTA 11940 

TGTTAATAAA ATACGTCAAG ACTATTACAT TTTCATTAAT ATTGACATAG ACAATTTATC 12 000 

TCTCGGCTTG TAATATGTAT AATTGTTACT AAAAGATATT TTGCTTGTTA CCTAATGGAG 12 06 0 

GTTACATATA ATGAAGAACA ATAAAATTTC TGGTTTTCAA TGGGCAATGA CGATTTT CGT 12120 

45 CTTCTTTGTC ATTACAATGG CGTTATCCAT TATGCTCAGA GATTTCCAGT CTATAATTGG 12180 

TGTCAAACAC TTTATATTTG AAGTTACAGA TCTAGCACCA TTAATTGCTG CAATCATTTG 12 240 

TATACTCGTT TT CAAAT AT A AAAAGGTCCA ACTTGCAGGT TTAAAATTCT CAATCAGCCT 123 00 

50 GAAAGTAATT GAACGTCTAT TGCTAGCTTT AATTTTACCT TTAATTATTC TAATTATTGG 12360 

TATGTACAGC TTTAATACAT TTGCAGATAG CTTTATTTTA TTACAATCAA CAGGCTTATC 12 420 

AGTACCTATT ACACACATTC TGATTGGACA TATTCTGATG GCGTTCGTAG TAGAATTCGG 12480 
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TGTTGTTGGT TTGATGTATT CAGTTTTCTC AGCAAATACA ACTTATGGTA CAGAATTTGC 12600 

TGCTTATAAC TTCCTTTATA CATTCTCATT CTCTATGATT CTTGGTGAAT TAATTAGAGC 12660 

GACT AAA GG A CGTACAATTT ATATTGCAAC GACATTCCAT GCTTCAATGA CATTCGGACT 1272 0 

TATTTTCTTG TTTAGCGAAG AAAT CGGCGA TCTATTTTCA ATCAAAGTCA TCGCCATTTC 12780 

AACAGCAATC GTTGCAGTAG GATACATTGG TTTAAGCTTA ATTATC CG AG GTATTGCATA 12840 

TTTAACAACA AGACGAAACC TTGAAGAACT TGAGCCTAAT AATTATTTAG ACCATGTCAA 12 90 0 

TGACGATGAA GAAACTAATC ATACTGAGGC TGAAAAATCT TCTTCAAATA TTAAAGATGC 12 960 

TGAAAAAACA GGTGTAGCTA CTGCATCAAC GGTTGGTGTT GCTAAAAATG ATACTGAAAA 13020

TACAGTGGCT GACGAACCAA GCATTCATGA AGGTACTGAA AAAACAGAAC CTCAACATCA 13080 

CATAGGTAAT CAAACTGAAT CTAATCATGA TGAAGATCAt GACATCACTT CGGAGTCAGT 1314 0 

AGAATCAGCn GaATCAGTTA AACAAGCACC ACmAAGTGAC gATTTaACAA ACGATTCAAA 13200 

TGAAGATGAA ATAGAGCAAT CATTAnAAGA ACCTGCGACT TATAAAGAAG ACAGACGTnC 1326 0 

ATCAGTTGTA ATTGATGCAG AAAAACATAT CGAAAAAGCT GAAGAnCAAT CTTCAGATAA 13 320 

A 13321 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8549 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATGTGTTGTA AACTTTTATG TTGAAAAAGC TACTTATCTC AATGAAAACA AGTAGCATTT 60

AATAAATTAA TTAGTATACA GCTAGTTTTT CTAATTGTTC TTTAACTTGA ATTAAGTTTG 120

ACCGTATTAG AGAGGCAGAT TGATCCATCG TTTGAATTGC TTGTCCTTCA TTTTCGTTCA 180

AGCCATTACA AACAACTTCA AACTGTTGTG CCATTTGATC AAGACGCGCA TGAGCTTGTG 240

TGTTTAAAAT AAACATATCG TCATAATGTG ATGGCGAATA GATAATTCGT CGTTGTATAC 300

AAACGTATAA AAACCTTGTC ATATCAACGG TTTTGGCATT TTTAAACCTC TGTGTTTTCC 360

ACGCATGTTT GCCCTTATTT AAATAATTTG CCCTTTTTTC GCCCCGAAAA AAAAACACAA 420

AAAAATAACC ACACTCCTAA ATTAATAGGT GGTGTGGTTT TGTTGATTGT AGGGGTATAA 480

AAATAACCGC ATTATTAAAG ATACGGTTAC TCTGTTATCT GTAAATATAA TAGTAGTTTA 540
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AAACAGGACT CCACATAAAA ATCAACTCCT TTATATACCA TAATGATACT ATATTTTCTA 6 60 

GTTTATTTCA ATTTTTCAGT TTTTAAAAAT GAGTTTCTGT TTTTATTTAT ACGCTTTTCT 72 0 

GTTTTCTTTT TAAATTTTAT CTTTTTGTTA TTCCATTCAT TGTAAAATTC TATTAAATTA 780

ACATAAAATT TTTCATGCCC TATTTTATTT GTTGATGAGA TATCAATGTA AAGACTCAAT 84 0 

ATTGTTTTTA AATAGATTTG ATGCAACGAC TGATAAACCG TATTACTATC TGCTATGTTA 9 00 

TTGGTAAAAT GCATAGAAAA ATATTCTAAT TTATTCATGC AATATATATG GGTTTCATTA 960 

TACTTCTTAA TGAGTGTATT TATACCTTGC AATACGTCAT TACTTTTAAT AACAATTTCT 102 0 

TTTTCACCTG TCGAAAAAGT CCACTGTTTA TCTCCTATAT TTTCTTTAAT TGTTTTCTTG 1080

TTGTCAAATT CTAAAATTAT AGCCCGTAAA CACTCTTCTT TATAATTCTC GTTCTTGAAA 114 0 

GTACGAAGCA AAATTTTTAT AAATTCGGTA TTGGTGACTT TTTTATAAGT GTGATATTTT 1200 

GCAATCTCTT TATCAGTAAA GACTGTTCTT AGTTCGTGAT TATCAAAACT T AAATT CATC 1260 

TTATTCTCTA ATTCATTAAT TTTATCTTGC AAACCAACAT TTTCTAAAAT TTTCTTGTTT 132 0 

ATCTCCCCTA TATCAAAACT CCTTTTCGAA ATTAATTTTG AAAACTCGTC TGCCATTTCA 13 8 0 


ACAGCCTTTT CTTTCCTTTT ATACCTTTTG TTAAATTTAT GAACCACCGT TGCAGCATAA 1440

TACGATATCC CACCAGATAA AATAGATGaT ATTATCGGTA TGTATATATC ACCTTTCATA 1500

TTTCCACCTC TTTTAACACA ATTAAGTATT ATGATACACA ACTTGCGCAA AAAGATGTAG 1560

ACAGAACATA ATGGCGAACA AAAACAACCA CCCAGTAACT AGTATGGGTG GCGTAgACTA 1620 

TAACAACTCT ATGTTATCAA GATATATGTA TCGAGTGATG GCAAGGAAGA AGTCTCCTGC 1680 

GGGACCAACA GTCAGATATA TGGCCTCTGC CGGGCTATAT AGTTCACTCC TACTATATAA 1740

AAGTAAGTAT AACATAAAAA GCACCCCGTA AACTGTTATA CGGGAATGCT AAAGTCATAT 1800

ATACTACGGG GAGTAGTATG AAAACTATGC TCTCTATCGT AAGAAAAAAC ACCCAGTGAC 1860


ATGCTTGGGT GAACAAGGAT AGATGTAAAT AGTTGATGCA TGTGTAcACA TCATAACAAA 192 0 

AAACTAGCCC GAAGcTAGCT ATAACATAAA AAAATAGGCA AGTACCGAAG TACCTGCCAG 1980 

TTACGCACAT TTAAATCTTG AGAGTAATGT TAAAAAGTGT ATAGGAATAT TAACATCCAT 2040

CCAAATAGTT ATTTAATAAC TGTAAGATTC CCTATAATTA ATGTAGCaAA ATTTTTATTC 2100

TAAGTAAATA CTAAATCGTG CTAAACTTAC CAAAACTACT TATTCTATTA CCTGCCTTGT 216 0 

CTACCTCTCC TGTCGCTATA TAACGACGTT GTCCACTATT AGCAATATAA GTAATCCATC 2220

TATAGCCATT GATGCAATAT GCGCCGTCAT ATTTAATTGT TGCGTTATTA GGTAATACAC 2280

CTGTAATTCT TGAATTAGTT GAATAGCCGT CCCTTACGTT ATTACCTTTA ACATTGGCAA 2 34 0 



CTGGCACTGG TGGATTTTTT TGGTTTTTAG CTGATGTTTT AA2ATTACCA GCTACCAAAC 24 6 0 

CACCTATAGG CTTACCATGA ATCGCACCGG CTATTAATTT AGAATACAAG TCATAGTTTT 2 52 0 

TCTTAATCCA ATCCATATCA TTTTTATTAG TAATAAAACC TAATTCAGAT AAACGATAGT 258 0 

TTATATTTAT TTCTGCTGAT ACATTAACGT TTAGTAAATC ATTACGAGGT GTTACACCTC 2 64 0 

TTATTTGTCC TAAGTTATTT TTAATAACAT CTTGTATACT TTTATCAATA GTATCTGCAT 270 0 


TGAATTGACT TGAAATAATA ACATGCCCAC CACTTGCACT TTCTCCTGCT GCGTCTAAAT 27 6 0 

GAATCTCTAG AACAATGTCA TACCCATGTG ATTTAACCCA ATATAAGCCA TAAT CTTT AT 2 820 

TATTTCCTAC ATTAACACCG TAAGCAGTAT CTTGATACAT ATCTTGTGAT TGACTTGAGC 2880

CACCATATAA TGCAACTTCG TGACCTGCAT GTCTTAAATA CTTAGCGATA TTTGGTGTTA 2940

TATATTTACG GATAAAATCA CGTTCATTTG TTCCGTTTCC GACTGCTCCA GGATCGTTAT 3 000 

AACCATGACC GGCTACAAGC ATAATTTTTT TAGGTTTAAT TACTGCTTGC TTTTTGGCAG 3 0 60 

TTGCTTGCTT AATAACGCTT TTAGCTTTAT CTCCAACACT TACTTTATCT GGGAAATTTA 3120

ATCTAATAAA ATACATTGGG TCATCGTAAT AATGAACATG TCTTGTAACG GTTTCGGGAC 3180 

CCCAACCAGG TTGCGCAACG CCATTTGTCC AACCTTTACC ATTCCAATTT TGGCCAAACG 324 0 

ATGTGAAAGT GTTTAGATTA GCGCTCTCAA CAATTTCAAC ATGTCCaGct CCGCCACCAT 3300 

ACTTTGACGG GAAAACGACA ATGTCCAACT TTTGCGGTAA AAAGCTATCA TAGTTTTTAA 3 3 60 

TTATTTGCCC GTATTTTTCA ATCCTTGCTT TATTATCAAA TGGAATATTA TAAG CGTAT A 34 2 0 

AACCTTGTAA CcTTTCGCCT GTTGCTATCA TAAAAAACAT ATTTGCGTAA TCGTAACACT 34 8 0 

GAAATCCATA AAACAAATCA GGATTGAACT GCTTCCCTAA TGAATTATCA AACCATTTTT 3 54 0 

CTGCTTGGTT TTTTGTTATC AACATTGGTC AACACCTACC CTAAATCATT TGTGTCGTTC 3 6 00 

ATA'KTCGTAG GTGTCATTAC TTCTTTAATT GGCGCTTGCC CTGTTGCTTT TCTATACTTG 36 6 0 

TTTTCAGCTT TATATTTCTT TAGCTTTTGA TTTGCCCATT TACCTTCTTG AGATGTTGGA 3 72 0 

TTATCTTTAT ATGTAGTATA TAAAGCAACA ACTGTTAAGA TAATCGATGA AACACTTTCT 3 780 

TCATCTACTG GTATCGGACT TATACCTTTA TTCGCTAAAA ACTGATTGAC TAATGCTAAG 3840

ATCAATACGA TGTATCTTGT TATTACTTTT GCATCCATTT GTTTGCTCCT TTTAT CCAAA 3 90 0 

ATAAAAAGCC AGTGCCGAAG CACTGACTCT TAACTATTAC TT ACACTTA C TAAACCAGAA 3 96 0 

ACACGACCAA AAGCTATATC CTAAAATTCC CTTAAGCATG GTAATCACCT CCTTTAAATG 4020

CCAAAAATAG TTTTTAACAA GGCTATAACA AATGTACTTA GAATCGTCCC TATTAATCCT 4 08 0 

AGAATCCACA TCTTGATGTC TCTAATATTT TTAGCATTTT TCTCTTTATT TTTTTCATCT 414 0 
TGCGTTCTCA GACTGTCTTC TATTCTGTCG AATTTTTCAA ACATAGTCTT ATCATTTTCT 4260

TCTAATCGCG TTAAACGCCA ATCTTGTTCG TGTCGTTTGG TAAATCCAAA CATTACACCA 432 0 


CCCACTTTAT TCAAATTAAA AAGCCATAAG ATTATAACCT ATGACTCTAG ATTTTCTGGA 4 3 80 

TACTTTTCTC CTGTAATAAT TGCATATTCC TCTTTATCTA TAACTTCCAT ATCTACATAC 4440 

CACGCTATAT CTTCTTTACT ATATTCTTTC AATTGATACC ATGTTTTAAT ATCTTCGAAT 4 50 0 


GTTGGTGAAA TTAATTTAAG CATTTTCAGT CTCTCCTTTA ACCTCTTCTA ATTTTTTATT 4 56 0 

AAGTGTCACA AGTTGTTTTG CCATTAGTGC ATTTTG CTT A TTAACTTGCA TCGATAACTT 4 6 20 

TGTACTTTGA ACAACTTGTT TCTGCATACT AGCAACCATT TTTCGTAAGA TGTCATCAGA 4680

AGCGACTGTG TTTTGTTCTT CACTGTCAAT CTGTTGATGC AAGTCATCTT TTTCTTCTGA 4 74 0 

ATAATCTTCG TTAAAAACTA TTTCCCCATT TGAATATTTA AAGGCTTTAG GTCTAAAAAC 4 8 00 

TTGAGAGAAA TTTTCTGGTA AATTTTCAAT ATCAATACCT TCTTCAAAGC CACCAATGAT 4860

AGCGTATGAA ATTATCTCAT TACGCTTGTT AACTAATATT TGCATTATTT TCTCACTCCT 4 92 0 

ATAATTTTGT TAATTGTCCC TCTATTTGCG TTCGCACCAG AGCCTCTTTG ACTTCCTAAG 4 9 80 


TCGAAATAGA CATCGTTTGA TATAGTTAAA GATGTACGAC TAGATTTAGT TAATCCAAAC 504 0 

TCATAAACAC CTCCACCATT TCCATCACCA TCTGGAAGAT TTGAGGGATT CAATGAAATC 5100 

TTTCCTCCTC CAAAAGGACT GCCAAACTCT GTAAAGTCAC CACCTGGAAA AGTCCCATAA 5160

AAAATTAATA AAATAAATTG GTCTAAACTC TCATTTAAGT ACAATGTAGA GCCCACACCA 5220 

TTTGCTGTTC CATCAAAAAT AACCGAATAC CTTTTATTAA ACTTGTCATC TGCGTATAAT 52 80 

TTAGCGTTAC TTTCGGCCAT ATTAGCTTTT GATTGGGCAC TTTGAACAGT TTCAAAAGGT 5340

GTATTGTAAT CATTAATAGC TAATTCTGAC CACTCAGACC ATGAACCCGC TT CTTTT CTT 54 00 

TTAACAAATA CTTTATTTGT ACCGTTCGGT CGATAAGTCA TACGCTTGTA ATCTGAAGTT 54 6 0 


ACTACTAAAT ATTCGACAGT ACCGTTAGTA CTAACACCTC TTGGATAATT TATAGCTTGC 5520

GAAACATAAA TAAATTGGGT TGAATCACCT ATTCTTTGTT CTGGATTATT AAAATCAAAT 5580

CCAGTAATCT GCATTATCTT ACCATCATCT TTAGTAATCT TAGCTTTTTG CCAATTTGAA 5640

GTAGAACCAC TTGTGACTAA ACCACCACTA TTCACTGACT GCTTGAAGGC TTCATGTTTC 5700

TCATCCATAT ATCGCTTTTG CTCATCGAAT GTTCTTGAAT ATGCTTGCGC TTTATTTTCC 5760

AAATCAGATA TATGGCTATT AGCAAGTTGC TTTAATTCAT CTATACTTGA AGATTTTGCT 5820

ATTTGAATAT CTGATAGACC TTTTTCTTTA GCTTTTTCAA TCAGACTCGC ATAATCTTCA 5880

CCATTTTTTA TAGCCTCGTC CATTGCTTTC GCACGATCCA TAATAGTTTT TTCTAATTCC 5 94 0 
TCAACGTTAA ATGTCATACT TCTCTCGACA ACTACCACGT CTGAATTACC TAATTCTGCA 6060

ACCGAAACTT GAGCTTGATA ACTTCCATCT CGTTTAATTA CATCATTAGG TAATTGAAAT 6120 

TTTAAAATAC CTTTAAATGG ATCTAATATT TCTAGTGGAG CAACTACCAT GACTCCTTTA 6180

CCTCGAATCG CTATTCGTGC kTTGATATTT tCTTCACTCA ATAATAACGG TTGATTATTT 6240

TTAGTGATAT TAAAAAGAAG AACAGAAGAA TCACTCTCTC CTGTTCTAAA AGTTATATCT 6300


AGATTTGAAA TATTTCCATA ATGCGCTGTG TTTTCTAAAT TTATAGCTAC AGATTTCTCT 6 3 60 

AAATTACTCA TTAACTTATA ATTCTCCCTT CGTGTAAAGT CCATGGCCCT GAACTTGTTT 64 20 

TACTATCATA ATTTTTCAAT AGTATCTCAG CAGATGCTGT AACACTATTA CGAACTAGCC 6480

TATGAACAAA GCCACCTGTG TTTGAAGCTT CTACATATAA GTTCCAACCA GCTACCCCTT 6 540 

TACGTTCAGT TGGAAAATCT GTAAAACGTT TTGTATCATC CGTAGTTAAA TAAAACGACA 6600 

TGCCTACTAT GTTAATATCT GACATTTTTG TGATGAATGA AGGTACTCTC TCCCATTTAC 6660

CACTATTTTT AGGCACATAA TTCCAGTCCG AAATGTCTCC AGTTCTTCCA GAAAGCACCC 6 720 

TTTCAAAAGT CATCATATTC CTTGCATAAC TATTACGCGT CAATATCTGA ATTACATCAC 6 780 


CGCCAGTTTG TGGTGGCTTA ACTTCCAAGA ACCAACCTGC ATCACGCCAT TCTCTTGGTA 6 84 0 

ATGGGAAATC ATCGATTTGA ACTGTATGAT CAGTGTATAA ATAGTAAAGA CCTGGCTCTG 6 900 

TTAACATCCC AAGATTCTTA AGTTTATCAG GCCTCATTGG TAAAGGTTTA ACTCTACCAC 6960

CTGTGTCACT CaTGATAAAA GGAACGCCTC TTGAGTGAAG TATTTCTAAA ATACCTCTTT 7020

GCCCAATCAT GAAAATACGA TGTGTTCTAT TTCCaTCACC ACCGACAGTA ACACCTAGCA 7 080 

TCAAAGCTTT TTTACCACTA TCTTTGTCAT AGTATATTTG CAAACCTTtC TgCTTCCGCA 7140

AATTCGCCAG GAAATGAATC tAgTGTTCCA CCATAGTCAG CATTAACCTG ATACGCTTCT 7200 

TCTCCTGTTT CTAAATCGAA AGCCGTTAAA TAGTTTCTAT TATTTGGATT ACTGTCTCCT 7260

GTATACCAAT ACAAGTATTT TTCATCAAAA GTCACACCCT GCATTGGTTG GGTTTCGTTT 73 2 0 

GTTAGTCTCA TAGGGATACT GATTTTATGC AAAACTTTAT CAATATTTTT ATCAACATCG 73 80 

TCTAAACTTC TTATCTCTAT ATAAnTCATT GAGTTTTCAA GTTCCCACTG ACTTCTAGGT 7440

CTCTCaATTC TGTATAGAAT TTTATTTTCT TTTTCATTTA TGACAGGGGT GATGTAGGGT 7500 

TTTTCTGGGT GTCCTGTAAA TACATCTTGC ATACCATACT TGCCATAGCT AATTTCCACA 7 56 0 

TTAGGCGTAT ACTTGAAACG AACTAATGTA TTCTCATTAT TACCATTTAA GATAAAACTA 7620

TAAATCCATA ACTCATcATC AATATATCTA TAACCGTTAT GTGTACCATG ACCCCCACCT 76 80 

ACAATCAATG AGCTGTCTAT AAATTGACCA TTAGGTCTTA GACGACTTAG CATATAGCCA 7740



ATTACTGCAT TTGTAAgAGG TGCAAGTTCT GTCACAAATA AAAATT CTTG CTTATCAGGT 7 860 

TCAAAACGAT ACTCGATATC AAGAATTTCT TGTTTGGTCT TATTTAATTC TCTTATAGTT 7 92 0 

TCCTCTTTAT TAATTTGAGT TTTGGTTTCC CAATCGTCTA AATGTTCTTT TAATGTGTCA 7 980 

AAGGTTTCGC CGTTTACATT AACTCGAGCT TGAACAATCT CATTAGCACT GTTATTACGT 8 04 0 

GGTGCCACAA CAAGTGCGTT AATTTGACTT TGTAAAGATT TGTTTACTGC TGCTTGCGAT 8100 

CTACCATTAT AATAAATTTG CTCAGCGAAG TGTTGAATTG TTTTAGCTyT CTGATGCAAC 816 0 

TTAAACTCTG TTGTCAAGCC AAGCGCAAAT TGCTCTATTC TTTGTAAGTT TTGTATTTCC 822 0 

TTAGCTCTAT AATCTCGACC TGCTAAAGCT CCCAAATCCT TTATTAAATA CAAATTTTCC 82 8 0 

ATAATGCACC TTCCTTTCTA ATAAAATAGC ACTGTACCAA GTTTCCCACT ATCGTCAACT 8 34 0 

GTTATTTTCC ACAATTTACC GTTTGGGGAT TTCTGTACAA TGCTATTTTG AATAATTgcC 84 0 0 

TGctTCGCCT ATTTTTAAAT TAT CTAATTT ATTTkTATCA TTTACCGAAA TGATACCGTC 84 6 0 

TTGAGGCAAT CCATCAATAn CACTACTGCC TGCATAAGGT ATCCCATTTA TAG CTTT C CA 8 52 0 

ATGTGTAGCT GGAAAGTACT GTTTATCGT 8 54 9 
(2) INFORMATION FOR SEQ ID NO: 6: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3601 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AGGCGTGTAG TGACTTACGG nTAGGAAACT ATGTATCCGA ATGATTTATT GAGACCAAAA 6 0 

AGGCATTAAA GTCCATTGAA ATATCnGGTA GCGmGTTGGT ACgTGGACGT GGGGGCCCTA 12 0 

GATGTATGAG TCAACCATTA TTCAGAGAGG ACATTTAACG TAATAAATTA TAGAmACGAG 180 

GGTGAAAATA ATGACAGAAA TTCAAAAACC GTATGATTTA AAAGGCAGAT CATTATTAAA 24 0 

AGAAAGTGAT TTTACCAAAG CAGAATTCGA AGGACTTATT GATTTTGCAA TTACATTAAA 300

AGAGTATAAG AAAAACGGTA TTAAGCATCA CTACTTATCT GGAAAAAATA TTGCACTACT 360 

ATTCGAAAAG AATTCGACGA GAACGCGTGC TGCGTTTACA GTTGCGTCTA TTGATTTAGG 42 0 

TGCGCATCCA GAATTTTTAG GAAAAAATGA TATTCAATTA GGCAAAAAAG AATCTGTAGA 480

GGATACTGCG AAAGTATTAG GTAGAATGTT CGATGGTATT GAATTCCGTG GTTTTTCACA 54 0 

ACAAGCTGTT GAAGATTTAG CGAAGTTCTC TGGTGTACCG GTGTGGAATG GATTAACAGA 60 0 
TCTAGAAGGA ATAAACTTAA CTTACGT7GG AGATGGACGT AATAATATTG CGCATTCATT 720

AATGGTAGCA GGTGCTATGT TAGGTGTTAA TGTAAGAATT TGTACACCTA AAT CATT AAA 7 80 

TCCAAAAGAG GCATATGTTG ATATTGcAAA rGAAAAaGCG AGTCAaTATG GTGGTyCAGT 840

CATGATTACG GATAATATTG CAGArcCAGT TGAAAaTwCm GATGCTATAT ATmCAGATGT 900 

TTGGGTATCG ATGGGTGAAG AAAGTGAATT TGAACAcGTA 7TAATTTATT AAAAGACTAT 960 

CAAGTGAATC AACAGATGTT TGATTTAACA GGTAAAGATT CAACGATATT CTTACATTGT 1020 

TTACCAGCAT TCCATGATAC AAATACACTT TATGGACAAG AAATTTATGA AAAATATGGA 1080 

TTAG CTGAAA TGGAAGTTAC AGACCAAATC TTTAGAAGTG AACATTCAAA AGTGTTTGAT 1140 

CAAGCTGAAA ATAGAATGCA TACAATTAAG GCAGTAATGG CAGCAACATT GGGGAGTTAA 12 00 

TCACTAAATG GAACGATATG AATATGATGT GTCTGATGAT ATAAGTGTCA TGTACAGACA 12 60 

CCTCATATTG GTATTAAAGG AGAAATGAAT ATGAACGAAT CAGGAGATAA CAAACTCAGT 13 20 

AAATCTTCTT TAATTGGACT AGTTATAGGA TCCATGATTG GTGGCGGTGC GTTCAATATA 13 80 

ATGTCTGATA TGGGCGGTAA AGCCGGTGGA TTAGCCATTA TTATTGGTTG G ATT ATT A CA 14 4 0 

GCTATAGGAA TGATTTCATT AGCGTTCGTA TTTCAAAATT TAACCAATGA ACGGCCGGAG 15 00 

CTAGACGGTG GTATTTATAG TTATGmTCAA GCAGGATTTG GCGATTTTGT AGGATTTATC 1560

AGTGmTTGGG GATATTGGTT CTCAGCGTTT TTAGGCAATG TTGCCTATGC AACACTATTG 1620 

ATGTCAGCAG TAGGTAACTT TTTCCCGATT TTTAAAGGAG GCAACACATT ACCAAGTGTT 16 8 0 

ATTGTCGCCT CGTTACTACT CTGGGGTGTC CATTTCTTGA TTTTAAAAGG CGTTGAAACA 174 0 

GCAGCATTTA TCAATAGTAT TGTTACTGTT GCAAAGTTAA TACCGATTTT ACTTGTAATC 1800

ATATGCATGA TAATTG CATT CAATTTTGAC ACTTTTAAAA CAGGCTTTTT CAGTATGACG 186 0 

TCAGAGGGTG TATTG C CATT TAGTTGGGCG AGCACAATGA GCCaaGTtAA AAGTACGrTG 192 0 

CTAGTGACAG TTTGGGTGTT TATCGGTATC GAAGGTGCAG TAATTTTTTC TAGTAGAGCT 198 0 

nAAAATGAGA AAGATGTAGG TAGTGCCACG GTTATAGGAC TTATATCAGT TTTAATTAT C 204 0 

TATyTCTTAT TAACTGTATT AGCTCAAGGC GTGATTTTGC AAAATCATAT TTCGCAATTA 210 0 

GATTCGCCAA GTATGGCACA GGTGCTTGCA ACTATTGTAG GTGGTTGGGG AT CT ACACTT 216 0 

GTAAATATTG G TTTAATTAT TTCGGTACTA GGTGCATGGT TAGGATGGAC ACTGCTTGCT 222 0 

GGTGAATTAC CTTTCATTGT TGCAAAAGAT GGATTATTTC CAAAATGGTT TGCTAAAGAA 22 8 0 

AATAAAAATG GAGCACCTGT AAATGCACTG CTTATTACCA ATATATTAGT ACAATTATTT 2 34 0 

TTAATAAGTA TGCTATTTAC ACAGAGTGCG TATCAATTTG CATTTTCACT AGCATCAAGT 24 0 0 
CGACAGCAAG CAACTACTAA ACAATGGACG ATTGGTATCA TAGCCTCAAT TTATGCTATA 2520

TGGCTTATAT ATGCAGCAGG TATCAATTAC TTATTATTGA CGATGTTACT TTATATTCCA 258 0 


GCTCTTCTTG TTTATACaAT CGkTCmAAAG rATwATCAGa CACGTTTGAT TAAATCAGrC 264 0 

TATATTCtTT TTATGATTAT tATCGTACTT GCAGTTATCG GGTTAATTAA GTTATTGATG 270 0 

GGAACGATAA ATGTTTTTTA AAAGGAGCGA CAAAAATATG AAAGAGAAAA TTGTCATTGC 2760

ATTAGGCGGT AATGCGATAC AGACAACAGA AGCAACAGCT GAAGCACAAC AAACAGCTAT 2 82 0 

TAGATGTGCG ATGCAAAACC TTAAACCTTT ATTTGATTCA CCAGCGCGTA TTGTCATTTC 2880

ACATGGTAAT GGTCCACAAA TTGGAAGTTT ATTAATCCAA CAAGCTAAAT CGAACAGTGA 2940

CACAACGCCG GCAATGCCAT TGGATACTTG TGGTGCAATG TCACAGGGTA TGATAGGCTA 3 00 0 

TTGGTTGGAA ACTGAAATCA ATCGCATTTT AACTGAAATG AATAGTGATA GAACTGTAGG 3 06 0 


CACAATCGTT ACACGTGTGG AAGTAGATAA AGATGATCCA CGATTTGATa ACCCAACTAA 312 0 

AcCAaTTGGT C CTTTTT AT A CGAAAGAAGA AGTTGAAGAA TTACAAAAAG AACAGCCAGA 318 0 

CTCAGTCTTT aAAGAAGATG CAGGACGTGG TTATAGAAAA GTAGTTGcGT CACCACTACC 324 0 

TCaATCTATA CTAGAACACC AGTTAATTCG AACTTTAGCA GACGGTAAAA ATATTGTCAT 330 0 

TGCATGCGGT GGTGGCGGTA TTCCAGTTAT AAAAAAAGAA AATACCTATG AAGGTGTTGA 3 3 60 

AGCGGTTATA GATAAAGATT TTGCTAGTGA GAAATTAGCA ACGCTGATTG AAGCAGATAC 3420

CTTAATGATT CTTACGAATG TAGAAAATGT ATTTATTAAC TTTAATGAAC CTAATCAACA 34 8 0 

ACAAATCGAT GATATTGATG TAG CAACACT GAAAAAAtAC GCGGCACAAG GTAAGTTTGT 3 54 0 


GGAAGGATCG tGTTGCCAAA AATAGAAGCT GCGtACgtTT GTTGAaAGtG GGGaAACCAA 36 0 0 

A 3601 
(2) INFORMATION FOR SEQ ID NO: 7:


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CGACACTATT AAATGAATTA GAGCACAATC 7AACAAATCA AATTCATTTT TCAAAAGATG 60

AACGACTCAC ACATATCGCT TTAAAGTTAT TCGAAACAAC CG ATCCTGTT TCAACAAAGC 12 0 

AACTTGCGCA AGATGTTAAT GTTTCGCGTC GGACAATTGC AGATGATATT AAAATGATTC 18 0 
TTATTGGTGA GGAAGATCAT TATCGTAAAG CGTATGCACA CTTTATACAT CAATATATGA 300

AACAAGCTGC ACCTTTTATA GAGGCGGATA TCTTTAATTC AGAATCAATC GCATTGGTTC 360 

GCCGTGCCAT TATTAAGACA TTAAATAGTG AAAATTATCA TTTAGTTCAG TCGGCTATCG 420

ATGGCTTAAT CTATCATATA CTCATTGCCA TTCAGCGTTT AAATGAAAAT TTTTCGTTCG 480

ATATACCTAT CAATGAAATT GATAAATGGC GACATACTAA TCAGTATGCn ATTGCTTCAA 54 0 

AAATGATAGA AAACTTAGAA CGCAGTGTAA TGT 573 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1221 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TTGATATTTA TAACGTTATA TTTTAATAGT TCACCTGGAT TATTAAATAA ATAGTCCGCC 60 

AAATTTTCTT TTTCTTTATC AATCTGaTXG TAATTAACaC TTTCGaCTTC TGTAGGAATT 120

CTAATGTCAA CAGAAGCATT GATATAAGCT TGATGTTGCA TGCAATCACA CTCCTAATCC 180

TTCATmTmAA ACGGAGAAGT AAACCCGTCA CTATTCAAAT TCAATCCTTT TGCCCAATCA 240 

ACAGGCTTAT TCATGATAGT TTCGATTTCC TTAAGTCCAT TTGAACCTCT AGGTATTTCT 3 00 

ACAATTACTT CATCATGGAC ATGGCCAACT ATTTTAAAAC CTAATGCTTC AAGCCTTGCT 3 60 

ATAGAAATCG CAAGTAAATC CCTTGCAGTT GCTTGAACAA TATTCTCGAC TAACTTCCCA 420

CCATACGTTT TTAACTTTGA CCATTTACGG TTAAGATCTA ACCCCATAAA TTCAACAACT 4 80 

TGACTACCCC AACTATTTTC ACCAACTAAA GCTTTTGGAT AAGCTAAAGC TCTTCCACTA 540 

GGCAGTTCAA TCATTAGAAA ACCTTTTTTC ATATAAAATC TAAGTCCATG TGTATGATGC 6 00 

GTCTTTCGGG ATTTTACAGT ATTAATTGCA GCCTCTTGGC AAGCCTTCCA AAAATTAACT 6 60 

ATGTTAGGAT TTGCGTTACG CCAACTATCA ACTAAACCTT GTAACTCGTT TTCTTCAATG 72 0 

CCCATTTCCA ATGCACCCAT TGCTTTTAAA GCTCCAGCGC CACCTTGATA GCCTAAAGCT 780 

AATTCGGACA CTTTTCCTTT TTGTCTGAGA GGGTCGCCTT TAGTTATGCT TTCTACCGGT 84 0 

ACATTAAACA TTTGAGAAGC CGATGCTTCA TATATCTTTC CGTGTGTGTT GAATACATCT 900

AAACGCCATT GTTCTTTTGC ATACCATGCT ATGACTCTTG CCTCTATTGC AGAAAAATCA 960

CTTACTGCTA GTTCATTACC TTCTTCAGCA GTAAATGTCG TCCTAACTAA TTGACTTAAT 1020 
AGATCTCTTG CTATTTCTAA TTCAGTATCT GAAATATAAT GCTTTGTTAA ATTCTGAAGT 1140

TGTACACCTC TACCTGCCCA TCTTCCAGTA CCGGCACCGT AAAATTGAAA CAGACCTCTT 12 00 

ACCCGTTCAT CACTGCACAT C 1221 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1090 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TTTTGTTTGG TATGAGGTAG CAATGACGAC GTGTCATTGG TGGAGATTGT AAAAATACAT 60 

AATAAAAAGA AGCGGCAATG TATACCGCTC CTTTTTTATA CTACATACCG ATTTTCAACC 120

ATCTCTTTCT ACTTAGTAAT AAGACAATAG TATTAACTAT AAATAGAAGA ACGAAGAATG 180 

ATACTATATT TATAATTTCA GTAGGACACA TAAATGTTGA CTCGTTATTC AATATTTTTT 24 0 

CTACGGCACG ATACATCGTA TTGCTCGCCT CAAATGGAGC AACGATACCA AATATATTTT 3 00 

TATTAATGGC AACTAAGATG ACTGAACCAA TCCAATATAC AATGCTGATA CCTAAGCTGA 3 60 

TTAAAATGTT AGGTGAAACC ATACTAATCG TTCCAACAAC TAAGATATAT TGTAAGATAA 4 20 

CGAGTGAAAA TAAGATTATT AATAGTAAGT AATGTGAGAA ATCCGAATAT ATAATTGAAA 480

TAATAGTGAT ACTTAGAATT ATGAACACTA AACATTCAAA AAATAACACT GCTACCTTTT 54 0 

TATAGAAGAA GGTAAAGATA TTATCGCCAA TCAATTTATA AAACAGGATA TTTTTATTCG 600

AATACTCTTT ATTAATAAAA TATGCAATAA CAAATGAAAA TAGTAAGAAC CCTAATTGCG 66 0 

TTGCAACAGT ATATGAACTG AAGAAAAACT GGCTATAGCT TAAACTTTTA ACTTTGTCTA 720 

TACCTATTGG TAAAAAATAC CCAAGTAAGA AAAGGAATGT GAATAGCACA ACAAGCGTGT 780

AAATAATTTT ATTGGAAATA CTTTTTTTAA ATTCTAATTT CAAAGTGGAC ACCTCAATTA 84 0 

TAAATTAATG TAATCATTTA TGACTTCTTC TTTTGATTGG TACTCTTCTA TTTGAAGGTC 900

TTTAAAAATA AAGTATTTAC CCGGCAAAGC ACTTAAATCG GATAAATTaT GTGTAATATT 960 

GATAATAGTT TTAGTTTGAT GGCTTTGAAT AAAATCATTT AAAAATTCAT AAATTTCATT 10 20 

AACTGTTTTC TTGTCTAAAG CGTTTGTAAC TTCATCTAAT ATGATTAAAT CATGATCTTC 1080

CAATAAGAAA 1090
(2) INFORMATION FOR SEQ ID NO: 10: 
(A) LENGTH: 904 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

TTAGGACTAT TTTATCATAT TCATTTAAAT TACGGCTAAA AATTTTAAAA ACGGGGATTA 60

ATATATGGAA TTAAGCTATG AAAGTTAATT GATACTTGCA TTTTACGCTG ATTTATATAA 120

GAATAACTAT TGTATAGTTT TAAAAACGAA CGTACGTTTG CAGGAGGCGA AATCATTGGC 180 


AATGAATAAA CAAAATAATT ATTCAGATGA TTCAATACAG GTTTTAGAGG GGTTAGAAGC 24 0 

AGTTCGTAAA AGACCTGGTA TGTATATTGG ATCAACTGAT AAACGGGGAT TACATCATCT 3 00 

AGTATATGAA ATTGTCGATA ACTCCGTCGA TGAAGTATTG AATGGTTACG GTAACGAAAT 360 


AGATGTAACA ATTAATAAAG ATGGTAGTAT TTCTATAGAA GATAATGGAC GTGGTATGCC 4 20 

AACAGGTATA CATAAATCAG GTAAACCGAC AGTCGAAGTT ATCTTTACTG TTTTACATGC 480 

AGGAGGTAAA TTTGGACAAG GCGGCTATAA AACTTCAGGT GGTCTTCACG GTGTTGGTGC 540

TTCAGTTGTA AATGCATTGA GTGAATGGCT TGAAGTTGAA ATCCATCGAG ATGGTAATAT 600 

ATATCATCAA AGTTTTAAAA ACGGTGGTTC GCCATCTTCT GGTTTAGTGA AAAAAGGTAA 66 0 

AACTAAGAAA ACAGGTACCA AAGTAACATT TAAACCTGAT GACACAATTT TTAAAGCATC 720

TACATCATTT AATTTTGATG TTTTAAGTGA ACGACTACAA GAGTCTGCGT TCTTATTGAA 780 

AAATTTAAAA ATAACGCTTA ATGATTTACG CnwGGgTAAA GAGCGTCAAG AGCATTACCA 84 0 


TTATGAAGAA GGGAtCaAAG rGTTgTTAGT atGTCCAaTG ArGGAAAAGA AGTTTTGCCT 900

GACG 904
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11271 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 


GATT7CTAAA TCAAGATCTG TTTTACGATA ACCATTCAAA CCTTGACGTT CATCTTCTTC 60

AGGTTGATTT TGTTGCTGTG TGTCTTTGTT GTCAGAAGTC GCTACTGTTT TTTTATTATC 120

TGTTTCTTTA GTCATAACAA ACGCCTCCGT TATAAAACGC TATATTTAAT GATATGTGAT 180 
TTAATAAGAC GATTCAGCAA GTTTTAAAGT A7TATTTGAC TATGTTGGAT TAGGCATCTA 300

GTCCTATAAT ATCACTGACA TTGTCAAAAT GATGATCTTT TAAGTAACGT GCOATGCCTT 360

TGTTCATTTT CTTAGTTAAA CCTGGGCCTT CAATAACAAG TGATGAATAA ATTTGAATAA 420

GTGACGCACC GTGACGCATC ATTTTGATTG CATCTTCAST ACTGAATACG CCGCCTGTAC 480

CTATAATTAA AAATTCACCA TTTGTTTGCT GATAAgCATa CTTAATCAAT TTTAAATTAC 540

GTTCAAATAA TGGACGACCA CTCAAACCGC CTTCTTCGAC 7TTATTAGCA GAAGTTAAAC 600

CATCTCGTTG TCGCGTTGTG TTTGCTAAGA TGATACCGTC AAATGTCTCA GTAATCGCTG 660

GTAATAGTGC TTTTAAGCCA TCGAAATCCA TATCAGACGT 7AGTTTTAAA TAAATTGGCA 720

CTGTTACATC ATGTTGTTTT 7TAAATGCTG TTAAAGCTTG GCATAACATT GAAAATTCAT 780

CTTTATCATG GAAGTTTTGA AGATTTTCAG TATTTGGAGA ACTGATGTTG ACTGTGAAAA 840

ATGAAACGTC GTGTTTAAAC GTATCAATAA CCTTTATATA ATCTTGATAA CGCGCTTCAT 900

AAGGTGTCAT TTTATTCACA CCAACATTGA TACCAACAGG TACTTGATAA GCATTTTTAC 960

GCAAATGACT TAGTGCTTTG TTCATACCAA TATTATTGAA GCCCATTCGA TTTATCAAGG 1020

CGTCATCTTC TAATAATCTA AACATGCGTG GTTGAGGGTT ACCCGGTTGA GGTTTAGGTG 1080

TGATACCACC TAATTCTAAA GCACCGAATC CAAGGTGTTC CAATGCTTTT GGTACTTCGC 1140

AAGATTTGTC GAAACCAGCT GCTAAgCCAA TTGGATTGTC GTACGTATTA CCTTGTATCG 1200

TTTGTGATAA CGTTGGATTC TTATAAGTAA ATAGTTTATC GACGACTGGG AATAAAACCG 1260

GaAACTTTTG TaACGTTTTT AATGCATCGA TAGTTAGTCC GTGTGCTTTT TCGGGTTCGA 1320

TTTTGAATAA GAAAGGTTTA ATTAATTTGT ACATGAGTAT GCTCCTATTT CATTATATTT 1380

GAGGCTTACT ATCCTCAACT TAATATATGT GAAATATATT CTTTTAATAG ACTAGCATTT 1440

CCATACATAA TTTCCTAGTT AAAACTAAAA AGTTTTGAAA ATTGACGCAA gTTTGAATAA 1500

CGTTTTTAAG ATTAAATCAT CCTAATTAGG CAATATTATA GTATAAAGTA AGTAGATTGG 1560

AAGGTGTTTG TATGAATGAA CAATGGTTAG AGCATTTACC TTTAAAAGAT ATTAAAGAGA 1620

TTTCACCAGT GAGTGGTGGT GATGTAAACG AAGCATATCG AGTCGAAACA GATACGGATA 1680

CATTTTTCTT ACTTGTCCAA CGTGGACGTA AAGAATCATT TTATGCTGCA GAAATTGCAG 1740

GTTTAAATGA ATTTGAACGT GCAGGTATCA CGGCACCTAG AGTAATTGCA AGTGGCGAGG 1800

TTAACGGTGA TGCGTATTTA GTGATGACGT ATTTAGAAGA AGGGGCTTCA GGGAGTCAAC 1860

GCCAATTAGG GCAACTCGTA GCTCAATTAC ACAGTCAGCA ACAAGAAGAA GGCAAATTTG 1920

GCTTCTCATT ACCTTATGAA GGTGGCGATA TTTCTTTTGA TAATCATTGG CAAGACGATT 1980
GGCTATGGGA TGCCAACGAT ATCAAAGTAT ATGACAAAG? GCGACGTCAA ATTGTGGCGG 2100

AATTAGAAAA GCATCAAAGT AAACCGTCTT TATTACATGG TGACCTATGG GGTGGTAATT 2160 


ATATGTTCTT ACAAGATGGT CGTCCGGCGT TATTTGATCC AGCGCCATTA TATGGTGACA 222 0 

GAGAATTCGA TATCGGTATT ACAACGGTAT TTGGTGGTTT TACGAGCGAA TTTTATGATG 2280 

CGTATAATAA ACATTATCCA CTCGCAAAAG GTGCATCCTA TAGACTTGAA TTTTATCGTT 2 34 0 


TATATTTATT GATGGTCCAT TTATTGAAAT TTGGTGAGAT GTACCGTGAT AGTGTTGCGC 24 00 

ATTCTATGGA TAAGATTTTA CAAGATACAA CAAGTTAGTT AAGACGTTAG ATTGAGATAA 24 6 0 

ATAGATAATA TGCACAGATA TTTTTACAAT GAGAAGCGAT ACAGCTGCCT CAATAAAAAT 2520

ATTTGTGCGT TTTTATTGTT GGAAAATAAA ATTTTAATCG CTATTGTTAA TTTCTGTAAT 2580 

GTAAAACAAG GTTGAGTTAC AATAAAAGTG ATTTTATAAC TTTTTGTTCA ATAAAATTCT 2 64 0 


AGGAATGATA CATATTTATT GATACAATAA TTTTGAATAT AATCATAAAA CAATATTTAA 2700 

GTATAATTGA ATGTTTGAAT AT CAT AT ATT GATACAGTTT CTAATAATTT TAAAATAATT 2 76 0 

TAAATGGAGA GAGGTGTAAA TGATGAGTAC AGTTCAAAGT GATATTTTTA AGACCAATAG 2 82 0 


TGCATCATCA TCTATTAAAA GCGCTGTTGA AACATGTAAT AATGTGTCGA AACCGGATAA 2880

AGATGAAAGT ACAACAGTAA GTGGAAATAA TAATGCTCAT AGTGTGATAG ATGATTTGAT 2 94 0 

GAGTAAGAAT CAATCTGTTG CTGAAGCAAT ACGAACTGCG AGCGATAATA TACAAAAAGT 3 000 


TGGTGAGGCT TTTGACCAAA CTGACGTAAT GATTGGTAAT GAAATTGGTA AAAATTAAAA 3 06 0 

CGTGGTGAAA TGATGTCGAA TAAACTGGAT GAAATCAATA AAATAATCAC AGCGAAACAT 3120

GAGCAAATGG ATGACTTATA TGATGAAAAG CGAGAGGTTA AAGCATTGAT AGATGAAAGT 3180

GATGCGCTTA ATCATTCGAT AGATCAATTA TATCAACATT TAGGTGAGCG TTATTATAGT 3240

AGCAATATGG CTAGTCGTAT GGAACAGTTC CGCGATGAAT TTCATTTTGC GAAACGACGT 33 0 0 

TCAACGGAAG CGTTATACGA GCAGCAACAG CAAATTCAAC ATGGCATTCG TAAAGTGGAA 3360

GAAGAGATGA TTGACTTGGA AATGCGAAGG AATGTTGAAA TTGAGACGGT GACAAAGGAG 3420

GAAAATAAAT GGAAACAATA GGAAGCATTA TTTATTTAAA AGAAGGTTCG CAAAAGTTAA 3480


TGATTATTAA TAGAGGmCCA aTTGTAGAAA TTGAAAATCA AAAGTATATG TTTGACTATT 3 54 0 

CTGCATGTAA ATATCCGATT GGTGTTGTAG AAGATGAAAT TTATTATTTT AACGAGGAAA 3600 

ATATAGATTC AGTTATTTTT AAAGGTTATT CTGATCAAGA TGAGGTTAGA TTTCAAGAGT 3660

TGTTTGAAAA TATGAAACAA AATTTGGATA GTGAAATACA ACGTGGAGAA GTTACACAAC 3 72 0 

AATAAAGAAA TACTTTTTCT TTATTGGGGT GGGACGACGA AATAAATTTT GTAAAAATAT 3780
ATGTCATTCA TAATCATTTG AACTAAACGT AGCAGCCTTA AATTTTAAAA AAAGACACAT 3900

ACCAACTTCC GAAATGTAGA TGAATTCTCT ACAATAACGG AAGTTTTTCT TTTAATATTG 3960

AAATTTCTCA AGGATAGGTC TATACTTTAT AAATCGTAAT TATTACGATT TATAATCAAA 4020

AACAATAACT TGAAATAGAT CATTGAGGGA GTGTTAATAT GCAACATCAT AAAGTGGCTA 4080

TTATcGGTGC CGGTGCTGCA GGTATAGGTA TGGCCATTAC CTTAAAAGAT TTCGGTATAA 4140

CAGATGTCAT TATTTTAGAA AAAGGAACAG TAGGACATTC ATTTAAACAT TGGCCGAAAT 4200

CGACCCGTAC GATCACGCCA TCATTTACGT CTAATGGATT TGGCATGCCT GATATGAATG 4260

CAATTTCCAT GGATACTTCA CCAGCATTTA CATTTAATGA AGAACATATT TCCGGAGAAA 4320

CATATGCTGA ATATTTACAA GTGGTTGCCA ACCATTACGA GCTGAATATC TTTGAAAATA 4380

CAGTTGTCAC AAATATATCT GTAGATGATG CATATTATAC GATTGCAACG ACAACAGAGA 4440

TATATCACGC GGATTATATC TTTGTCGCAA CAGGTGATTA TAATTTCCCT AAAAAgCCAT 4500

TTAAATATGG TATTCATTAT AGTGAAATTG AAGACTTTGA TAACTTTAAT AAGGGGCaAT 4560

ATGTGGTTAT CGGAGGTAAT GAAAGTGGCT TTGATGCTGC ATATCAACTT GCAAAAAATG 4620

GCTCTGACAT CGCACTTTAT ACTAGCACAA CCGGTTTAAA TGATCCGGAT GCTGATCCTA 4680

GTGTTAGATT GTCACCTTAT ACACGTCAGC GACTAGGTAA TGTCATTAAG CAAGGTGCTC 4740

GCATCGAAAT GAATGTACAT TATACAGTTA AAGATATTGA TTTTAACAAT GGACAGTATC 4800

ATATCACTTT TGATAGCGGA CAAAGTGTGC TTACACCTCA TGAACCAATA CTAGCAACTG 4860

GCTTTGATGC AACAAAAAAT CCAATCGTTC AACAATTATT TGTGACAACA AATCAAGATA 4920

TTAAATTAAC AACACATGAT GAATCGACAC GTTATCCGAA TATTTTTATG ATTGGTGCAA 4980

CAGTTGAAAA TGATAATGCC AAATTATGCT ATATCTATAA ATTTAGAGCG CGATTTGCAG 5040

TACTTGCACA TCTTTTAACA CAGCGGGAAG GcTTACCAGC TAAACAAGAT GTCATTGAAA 5100

ATTATCAAAA AAATCAAATG TATTTAGATG ATTATTCATG TTGTGAAGTG TCATGCACAT 5160

GTTAGAAGTG AAATATGATA TGAGAACTGG GCATTATACG CCCATACCTA ATGAACCTCA 5220

TTATTTGGTT ATTAGTCATG CGGATAAACT TACCGCAACA GAAAAAGCGA AATTAAGATT 5280

ATTAATCATA AAACAGAAAT TAGATATTTC ATTGGCAGAA AGTGTAGTTT CTTcGCCTAT 5340

AGCGAGTGAA CATGTGATAG AACAATTGAC ACTATTTCAA CATGAGCGAC GACATTTAAG 5400

ACCTAAAATA AGTGCGACAT TTTTAGCCTG GTTGTTGATA TTTTTAATGT TTGCATTGCC 5460

AATCGGTATC GCTTATCAAT TTTCAGATTG GTTTCAAAAT CAGTATGTGT CAGCATGGAT 5520

AGAATATTTA ACTCAAACAA CATTGCTCAA TCACGATATA TTACAGCATA TATTATTTGG 5580
ATTGATTAGT TTATCAACTG CTATAATTGA TCAAACAGGA CTCAAATCAT GGATGATATG 5700

GGCAATTGAA CCGTCAATGT TATGGATAGG ATTACAAGGT AATGATATCG TGCCACTATT 576 0 

AGAAGGGTTT GGATGTAATG CAGCAGCTAT TTCACAAGCA GCACACCAAT GCCATACCTG 5820

CACGAAGACA CAGTGTATGA GTTTAATAAG CTTTGGTAGT TCTTGTAGTT ATCAAATAGG 58 8 0 

TGCGACATTA TCTATTTTTA GTGTAGCTGG AAAGTCATGG CTATTTATGC CGTACTTAAT 5 94 0 


ATTAGTACTT TTAGGTGGCA TCTTACATAA AGGATATGGT TGAAAAAGAA TGATCAACAA 600 0 

CTTAGCGTTC CGCTACCTTA TGATAGGCAA TTACATATGC CAAATATA CG TCAAATGTTG 6 06 0 

CTACAAATGT GGCAAAATAT ACAAATGTTT ATCGTTCAAG CGCTACCTAT TTTTATCACA 6120

ATCTGTCTTA TTGTTAGTAT TTTATCACTA ACGCCAATTT TGAATGTTTT ATCACAAATA 6180 

TTTACACCTA TATTATCGTT ATTAGGCATC TCGTCAGAAT TGTCACCAGG GATTTTATTT 62 4 0 

TCAATGATTC GAAAAGACGG CATGCTCTTG TTTAATTTGC ATCAGGGCGC CTTATTACAA 6300

GGAATGACAG CAACACAGTT ACTACTACTT GTGTTTTTTA GTTCAACATT TACAGCGTGC 6 3 60 

TCGGTCACAA TGACGATGCT TTTGAAACAT TTAGGTGGTC AGTCAGCACT AAAATTAATT 64 20 


GGAAAGCAAA TGGTGACATC ATTGTCTTTA GTTATTGGTG TAGGCATCAT TGTTAAAATA 64 80 

GTAATGCTGA TTATTTAAAA AAAATGAACT ATAACTGAAT ATAGAGTCAT GTCAGTCAAT 6 54 0 

AGGAGATCTA TCTTGGAATA TGCTATTCAT ATGAAGTATA AGAGGAGAGT CGCAGATGAA 6 600 


AATAGTTATT ATAGGTGGGT TTTTAGGTGG CGGTAAAACG ACTGTCTTAA AT C ATTTG C T 66 6 0 

CGCTGAATCA TTAAAGGAAT CGCTGAAACC AGCAGTCATC ATGAATGAAT TTGGGAAAAT 6 72 0 

GAGTGTTGAT GGTGCCTTAG TATCTGAAGA CATACCTTTA AGTGAACTGA CAGAGGGGTG 6780

TATCTGTTGT GCAATGAAAG CAGATGTATC AGAACAGTTA CATCAATTAT ATTTAAAAGA 6 84 0 

GCAACCAGAC ATTGTATTTA TTGAATGTAG TGGGATTGCA GAACCGGTCT CTGTCTTAGA 6 90 0 

TGCTTGTTTA ACGCCTATTT TAGCTCCGTT TACAACAATT ACACATATGA TTGGTGTAAT 6960

AGACGCAAGC ATGTATAAAC ACATTAAATC ATTCCCTAAA GACATCCAAG GCTTATTTTA 7 02 0 

TGAGCAATTA GCATATTGTT CTGTCTTATT TGTTAATAAA ATAGATTCAG CAGATGTTGA 7 08 0 


AACAACGAGC AAACTATTGA AAGATTTAGA AGTTATTAAC CCAGAGGCCG ATATACAAGT 714 0 

CGGTATGCAT GGCAGCGTCA CTTTGCCAAT ATCAGTTAGA CAAATGACAG CAACTTCTGA 72 0 0 

CAATAAACAT AAGTCTTTAC ATCAAATGAT TAATCATCAA TTTGTGCAAT CACCAGTCAA 7260

ATGTACTAAA GCAGAGTTTA TAAAACGTTT AGCATGCCTT CCGTCTCATA TTTATAGGTT 732 0 

GAAAGGGTTT ATGACATTTG AAGACACCGC ACATACGTAT CTCATTCAAT TTACACAAGG 73 8 0 
CGGAAAGGGT ATTTCAAAAG AAGACTATCA ATGTTTGGAA CAGTAGTGTT TTCAGTGGAA 7500

GAGAATGGTT AACATGCCTT CATGTATAAT AACGAGTTGA TTTGAACGTT TAAGCGTAAA 7 56 0 

TAAAAATAAG CTTGGTCAGC CATCAAATAT AATTTGAAAA CTGTCCAAGC TGTTTTATTA 76 2 0 

GAGAACAATC AATTAACCCC ACATATTTAA TAATACATCA GCAAAGCCTT CAGGTTTTTG 7680

AATATAACCT AAGTGACCGC CTGGAATATC TACAATAGGT ATGCCAGTTT CTTTATTTAT 7740

ATAAAAGTTA ACATCTTGTG GGAAGGAGCC TCTAGAATCT GTCCCATTTA GTAGGGTGAT 7800

TTTATCGCTG TATTTTGTGA AATCATCCAA AGTAATATCT GAATGCGTAT ATTGTCTAAT 7860

TTCAAATTCT GACCAGAACA TCGTACGTTT GTACTGTTCT ATACGTCCTT CTTCAGTATC 7920

AGCAGGTTGA GACATCATTT TTGCATCAAT TGGTGCGATA TTTAATGTTT CGCCAAATGT 7 980 

TTTCATGCCT TTTTCTAAGC CTTCTGTTAA AATTTGATGC A CAATGTC AT CATTTTTATC 8 04 0 

TTTCCAATAA GTACTGTCTG GTAAAAATGT ATTAATTGGT GGTTCGTGAA ATGCAATCTT 8100

TTTAACGACT TCAGGGTAAT CTTTTAACAC ATGCATCGCA ACGATTGAAC CTGAACTTGA 816 0 

ACCTAATATA TAGACAGGTT CATCACTTAA TGACTTTGCA AGTTCGGCAA TGTCCTGTGC 822 0 


GTCGCGTTTG ACACGATAAT CACTGTCAGG GTTTGAAGCG GAATCAGGGA GTGGTTCAGT 82 8 0 

TAACTCGCTT TCTCCATAAT CACGACGATC AACGGCTACA ACAGTAAAAT GGTCTTTTAA 834 0 

CTGTTCTGCA AGAGGCAGAA AAATGTCTCC GGTACCGTTT GCACCAGGAA TAAAGATGAG 84 0 0 


CACGGGTCCT TGTCCGACTT GGTGGTATCG TAATTTAGCG CCTTGTAATT CTAAAGTTTC 84 6 0 

CATATTCAAT GACCTCCATT TGTTAATTGT TAGGTGATAA ACCTAATAAT TTAGCACCAT 8 52 0 

TTGTATAACT TATTTTCTCT TTTTCTTCAT CTGTTAAACC CAGTTCATCT AAAAATACAC 8580

CTAATTTTTC AGGCTCAATA TATGGATAAT CAGCAGCATA AAGAATTCTA TCAATACCTA 864 0 

CTTCTTTCTT GACTAAATCA AACTGTGGCT TCGTTAACAT GCCACTCGGT GTGATATAAA 870 0 

AATTATTTTT AAAGTAATAG CTTACAGGGT GGTTCAAATG TTCAGCGAAT AAAG CTT CAT 8 76 0 

CCATACGTTC TAAGAAGAAT GGGATAAACT CACCCCAATG TCCAATAATC ATATTTAACT 8 82 0 

TTGGATAACG ATCAAAAATA CCAGATAATA CTAGATGTAT TGTATGAATG CCGACATCAA 8 8 80 

TGTGCCAACC ATAACCAAAA CAAGCAAATG TTGCCGCAGT TACTTCAGGA TAATTTCCTT 8 94 0 

TATAGTATGA TTGATAAATG TCACTGTTAA CTGGCGCGGG ATGTAGATAA ATCGGTACGT 900 0 

CTAAATTTTC AGCTGTTTTG AAAATAATGT CATATTTGTC TTGATCAAGA AAA C CAT CTT 906 0 

GTGCACGTCC CATAATGAGC GCACCTTTGA ATCCTAAATC ATTGATGCAA CGTTCGAATT 912 0 

CTCGCGCTGC GGCTTCAGGC TCATTGATAG GTAAAGTTGC AAAGCCTACA AAGCGATTGG 918 0 
TCTGACCAAC CAAATTTGAA GGAGAACCAT TTCCATAAGA TAAGACTTGA ATTTGAACGT 9300

CTTGATTATT CATAAATTGG ATACGTTCAT CATGATGTGA TAATT CGTCG GCATTTGTAA 93 6 0 


AACCTGTCTT TTTTTcAAGG CCTTCTAACA TTACTTTCAT CGGTACACCT TTAGGATCTG 9420

CTGATATCGC ATTCATCGTT TCTTTTTGAA TATCTTCAAT GACATAATGT TCTTCAAACG 9480

TAATACTTTT CATTTACTTC GCCTCCATAT TGTATTGCAT GTTTATTGCA TCTATTGCAG 9540


AAGCATTTTT TATATACCTC TAATTTCAAT GTTTGTAACA TAAAATTGAT CTACCAAGGC 9600

ATCTCTCCAT CGCCATTAAT AAATGTACCT GTTGGGCCAT CTGCACCAAT CGTTGCTAAT 9660

TGAATGATTG GCTTGATTCC TTCAGAAACG TGTTTGGAAT TATTACTAAA ATCACCAACT 9720

AAATCAGTAT TTGTAGCGCC TGGATCAGCA GCATTGATTT GCATGTTAGG TAATCCTTTA 978 0 

GCGTATTGTA GCGTTAGCAT TGTTACTGCC GATTTAGACG AACAATAAGC TAATGAATTC 984 0 


ACTTTAGATT CAGCTGTTTC GGGGTTTGTA ACCATTCCAA ATGAACCTAA ACCACTTGAT 99 0 0 

ACGTTGACGA CAACAGGTTG TT CAG ATT TT TCTAAGAGAG GGACGAATGT ATTCATCATT 996 0 

CGTACGATAC CGAATACATT CGTTTGATAT ACTTCTTCAA CGTCACGAGG TGTCAATTTG 10020 

25 

GAAGGTGCTG AAAATTGACC AGATATACCT GCATTGTTAA TGAGGATATC AAGACGGCCT 10 080 

TCTTTTTCAG CAATCATGTT ATAAGCATTT TTGACTGAGT AGTCACTTGT AACATCTAAT 1014 0 

TGTACATAAT GAACACCTAA TTTTTGTGAT GCTTGTTGTC CTCTTACATC ATT C CGAGAA 10200 

30 

CCTATATAAA CTTTGTAACC CAATGCTTTA AGTGCCTCTG CACTTGCATA GCCTAACCCT 10260 

TTATTGCCTC CTGTGATTAA CACAATTTTA GTCATTACGT CCCACCTCAT CTAAATAAAT 10520 

35 GTTTAATAAA TAATTTCTGT ACGCTTCAAT TGAAATATGG CGATGCTCTA TTTGGAAGGC 103 80 

AAATACACTA GTTGATAATG ATTGCAACAG CATATCTGTT TTGAAtTCGT GTAAGTGTCG 10440 

TCATCGCTTT TAAATAAGTC ATAATAAAAA TCAAATAATT CTTGATAAAA TGCGCTTTGG 10 500 

40 TAAAAACGTA ATTTATTGTT GCCTGCTTCA ATACATTGCA GTAGTGCCTT ATTATCGATT 10 560 

TTAAATTGTA AAAGATAATC TAACGACACT TGCATAACCT CATAATTAGA ATGATAGTCA 1062 0 

TCTTTAATTT GCTTAAAATG AGTGATAAAA ATATCAAGGT CTCTTTGTAT GACGTAGTAG 10680 

45 

CATAAATCGC TTTTATCTTT GAAATGTCGA TACAATGTCC CCATACCGAT ACCTAGTTCT 10 740 

TTAGCAATAC GATTCATACT AATGTTTTCA ACGCCTTCTT CATCAAAAAG TTTGTGCGCT 10800 

50 ATTTCTTCAA TTCGTTGCCT ATTCTCTTTT GCATCTTTTC GCATGATTAC ACCTACTTAA 1086 0 

AATTCTCTAA AATTGACAAA CGGATAACTC TCCGTTTATT ATAAAACGTG TTAAGAAAGT 10 920 

TAGCAATGAA TTTGCAATAA CTATTAAATA TCATAAAAGA AAAGAGTGTT GATAATGTCT 1098 0 
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ACCTTATCGG TTCAAATGA7 TGCTGAAAAA CTGAATGTCA CTACAGAAGA TGTGGAAAAA 11100 

GTATTAGCTA TGACAGCGCC ACTAGGCATT TTTAGTCATC AATTACAACG ATTTATTCAT 11160 

TTAGTATGGG ATGTCAGAGA TGTAATAAAC GACAATATTA AAGGAAATGG ACAAACACCA 11220 

GAACCATATA CGTATTTAAA AGGTGAAAAA GAGGACTATT GGTTTTTAAG A 11271 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6261 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CAACCCGTTC AGAACAAAAT AAAAACCGTA CAATTTTATC ATCTTAATGA TTATTGTACG 60

GAAAAACTTT TTTACATCAT ATCTGCATGT GCATAATCGA TATCGGTAAA TTTATTATAT 120

TGTTTCATAA AATGTAACTT AACTGTGCCT GTTGGACCGT TACGTTGCTT AGCAATGATA 180

ATTTCAATTT CACCGTTTTC ATCATTCGTT TGTGGCTCGA AACCACCATC ATCGTCATCA 240

TCTTCATCGC CGCCACGGTT ATAGTAATCA TCACGGTATA AGAATGCAAC GATATCGGCA 300

TCTTGCTCAA TCGAACCAGA TTCACGAATA TCACTCATCA TTGGACGTTT ATCTTGTCGT 360

TGTTCAACAC CACGAGATAA CTGACTTAAT GCGATAACTG GACATTTTAA TTCACGGGCT 420

AATGCTTTTA ATGTACGAGA GATTTCAGAA ACTTCCTGTT GTCTGTTATC GGACGCACGT 480

GAACCACTAC CTTGAATCAA CTGTAAGTAG TCAATCACAA TCATGTCTAA GCCATGTTCT 540

TGCTTTAATC GACGACATTT AGAACGTAAA TCATTAATTC GAATACCCGG TGTATCATCA 600

ATAAAAATCT TCGTACGTGA TAATTTACCT ACCGCTATAG TAAAACGACT CCAATCTTCC 660

TCAGTCATAG TACCCGTTCT TAAGCGGTTT GAGTCAACAT TTCCAGAACT ACAAATCATA 720

CGTGTGGCTA ACTGATCAGC ACCCATCTCT AGCGAGAAAA TACCAACTGT ATACATATCT 780

TCATGCGTTG CAACTTTTTG TGCAATATTA AGTGCGAACG CAGTCTTACC TACAGATGGA 840

CGCGCTGCAA GGATAATTAA ATCATTTCGG TTGAACCCTG CTGTCATTTG GTCTAAATCT 900

CGATATCCTG TAGGTATACC TGGTGTTTGA CCACTATTTT GATCAAGCTC TTCAGCTGTT 960

TCATACACTT GTCCTAAGAC GTCTCGAATG TCTTTAAAGC CATCGCTTTC ACGAGAAGAT 1020

GATAGCTCTA AAATTCGACG TTCTGCATCA CTTAAAATCG CATCTAGTTC AAGTTCATCA 1080

TTATATCCAT CATTGGCAAT ACTATCTGCA GTTTGAATCA ATCTACGTTT TAATGCATGC 1140
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7CTGCAAGAT ATTGCSGGCC ACCCGCTTcA TTCAACGTAC CTTCCGTCGA TAATTGATCC 1260

ATCAATGTTA CAACATCAAT TTCTTTATTA TCTTCA7TTA AGTGCATCAT TGCACGGAAA 1320

ATATGTTGAT GGGCACCCCT ATAAAACGAC TCAGGAAGCA AAACTTCCTG AGTAGTATTA 1380

ATCAATTCTG GATCTATAAT AATTGAACCT AAGACAGACT GTTCAGCTTC ATTGTTATGC 1440

GGCATTTGAT TTTGCTCATA CATTCTATCC ATGAATGGTT ACACCTCTTA TTTCAATCCA 1500

ACTTTATTGT TCAACTGTGT GTACGCGAAT TGTACCTTCA ACTTCTTTAT CTAATTTAAC 1560

AGGTACATTC GTATATCCTA GGGAATGAAT TCCATTTGGT AAATCCATTT TACGTTTATC 1620

AATTTTAATA TCATGTTGTG CTTTTAGTGC TTCGGCAATT TGTTTTGTAC TTACTGACCC 1680

AAACAATTTA CCACCTTCAC CAGTTTTTGC TGaTACTTCA ACTTCAATGT TTGATAACGT 1740

TTCTTTTAAT GCTTTAgCAT CTTCAATTTC TTGTTGGCGT TCTTGTTTTG CACGTTTTTT 1800

CTGTAACTCT AATTGTTTAA GGTTACCTGG TG7TGCTTCT ACAGCATAAT TCTTTTTCAA 1860

TAAGAAGTTA TTTGCATAAC CTACTGGTAC TTCTTTAACT TCACCTTTTT TACCTTTACC 1920

TTTACCTTTA ACATCTTGTG TAAAAATTAC TTTCATGCAT CTTCACTCCT ACTTAATTGT 1980

TCTGTAATTG CTTGTTGTAA TTGTGCTATC GCCTCTTCGA CTGTCACACC TTTAAGTTGT 2040

GTTGCCGCAT TGGTTAAATG TCCACCGCCA CCAAGTGCTT CCATTGTTAA CTGGACATTT 2100

ACTGAACCGA GTGAACGCGC AGATATACCA ATCAGATTAT CTTCACGTCT CGCAACAACA 2160

TATGATGCTT CAATACCTTC TAAACTTAAC AGTTCATCTG CTGCTTGTGC AACTGTTACT 2220

GGATGATAAA TTTTATCGTC TGAACCATGC GcAATGGCTA TGCCATTATC TTCAACTTTT 2280

ACAGTTCGAA TTAATTCAGA TCGATTAATG TAAGTATCCA CATCATCTTT TAAGAAATGT 2340

TGCGTTAAAA TCGTATCTGC ACCATGTGCA CGTAAATAAC TCGCTGCATC GAATGTTCTT 2400

GATCCTGTTC GTAATGTAAA GTTTCTTGTA TCTACAATAA TACCTGCATA CATCACTGTT 2460

GATTCAAGAC GTGTTAAACG TTGTTCTGTT GGTTGATATT CCAGTAACTC TGTTACCAAT 2520

TCAGCTGTCG AACTTGCGTA TGGTTCCATA TATATCAACA ATGGATTAGA GATGAAGCTT 2580

TCACCACGTC TATGATGATC GATAACAACT TTACGGTTTG CTTTATTTAA GACATTTTCA 2640

TCTAAAACCA GTTCCGGTTT ATGCGTATCA ACAATCACTA CGGTTGTCTT AGATGTCATC 2700

ATATCCCAAG CATCATCTGA TGTAATAAAT CGCTCTCTTA ACTCTGGCTT TTTATCTATT 2760

TCGTTCATCA CGCGTCGTAA TGTTGGATCA ATGTCAGTCT CATTTAATAC GATGTATGCT 2820

TCTAAATTAT TCATCATTGC AAATCTAGAC ACACCGATTG CTGCACCAAT TGCATCTAAG 2880

TCAGGACGTT TATGTCCCAT GATAATGACT TTGTCACCCT CTGCAAGGAT ATCTTTTAAC 2940

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CCATAGAAAC GCACATTACC ATTAATACTT TTAATTGCAA CTTGGTCGCC ACCGCGTCCT 3060

AATGCTAAGT CTAGGCCTGA TTGTGATAAT TCACCTAAGT CGATTAAATT TTCAGTACCT 3120

TCACCAACAC CGATACTTAA TGTTAATTGG GCACGATAAC CAACACCTTT TTCACGTAAT 3180

TGACTCAAGA TATCAAATTT AGATTCTTCT AAGTCAGCTA ATATTTTTTG ATTTAAATAG 3240

GCTACGAATT GATCGGAACT GTATCTTTTG AAAAATATAT TATACTCAGT TGCCCATCGA 3300

CTAATGACAC GCGTTACCAT TGAGTTGATT TCCGAACGCT GCGTATCATT CATATTTTGC 3360

GTAATCTCAT CGTAGTTATC TAAAAATAAT GTCGCAATGA TTGGTTTAGA ATTTTCATAT 3420

AGTTCATTTG TTTGTACTTG TTCAGTTATA TCAAAGAAAT AGAGGCAGTG ATCATTCTCA 3480

GAATAACGTA CTTGGAAATG ATACTGATTA TATTCTATTT cAACGGATTT CACTCTATCT 3540

AATTGCTTTA AAATGTTTGG AAATACTTCA TTTACAGATT CAGAAATGAC ATTCGCTTCC 3600

ATATGATCTG TCATAAATTG GTTAACCCAT TCGATGTGAT CATTTTCATC TAAAACAATG 3660

ATACCAATTG GTAAATGTTT GATTGCTTTA TTATTTGTTG TTGAAATTTG AGCACTCAAA 3720

CCATCTACAT AACTATCCAT TTTCATTAAA GCTTGTCTGA ATAAAATGAT GCTAACAATA 3780

ATCATCACGA CAAGAACGAT AGATGCAATT AGTGCTATAA GACTATTAAA GATAAACCAT 3840

ACACCCATTA AAACAATTGC TGTGATGATC ATGATGACAA ATGGTATTAG TAAAGCTTTC 3900

TTAGTGGACT GCCGATTCAT TATTCCACCT CTATTCACTT TTTAGAATTA TTTTTCATGA 3960

TTCGCTTCAA ATTCAAACTT AAATCGATAA CACCAAGTAG TCCTACAATA TGTGTCGTAG 4020

GTGTCAGTAT TGTACCGATA ACCAATAGTA AAATCGTTAC TGCATTCGGC AAACCTTTCG 4080

CTTTACCAAA GAAATGAATA ACACTTAAAC CTTGAATATA CATTACTAAT GATAACACAA 4140

GTTGGAAGTT TAAAAGAATG CTCTGGAACA CACTCGGTTG ACCTGTAAAT AATAAACATA 4200

TGATAACAAT AATGTATATC CATAATAAAA TACCGCTCAT TTGCCACGCG AAAAGTGGCT 4260

TAAATACAGG TGTAGCGATT TTAAATTTTC GTAAAATCGG AAATGTAACG ATTAAGTTAA 4320

TTAAGACGAT TAAAAATGTA ATGATAATGA TGAAACCTGG TAATTGAACG GTCGCTTGTC 4380

TAAACCC7TC TTCTAATATT TGGGTCATAT TCGCATCGGC ACCGCTCATC GTAATCGCTT 4440

CATGTAATGT TTGCTTGAAA GGTTTTACTA TGCTCGCTGA TGGTGGAATC CTTCCGAATG 4500

TTTGTAGTAA CATAAAAGCG ATTAATGAAA TTnArCTCAT CGCTACTGTT GTTACGTATA 4560

ATATTCTTTC TTTAGACGTT CTTTCTTTGA GCAATTGACC AATAATTAAA CTTGCAATTA 4620

AGACTAATAT GATGGCACTT AAAACGAAAG TATTACCTAA AACAGTTGTT ATAATTACTG 4680

TAATAAGTGC ACTAATCCCG AAAGATTGTA TTGATTTATT CCATAAAACG ATACCTGGTA 4740
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CAAATACCAA CGCAATCGTT GCAATTATTG TTGCTTTAGG TTGTATTTTT GAAAACACAT 4860

AAGCCACTCC CATATTTTTA ACTATAGCTA TTATTTTAAC CTCTTTAATG AAAATTAACA 4920

ATTTATAGAT TGTATGCTTC TATTTCATTT AATTGAATAA TAACTTTCAT GTTTTATAAG 4980

TAATTAACAT ACTCATTTGA ATCGCTTTTG TGTGCTTTCA TTTTCAACAT GATTATTTAA 5040

TCCCACTACA TAGCAATCAA GCTTGATTTA GATTTACAAT ACAT7TCCAC TCTCATGTAC 5100

TCTAGATGTT TTTGAATATG ATAACTGTGA TTTAGTGGCT TCATTCTTTG AAAATATATA 5160

TTATTACTTA CGCTTAAAAT GCTTTAAATT TAAGAAATGA TATAAGTTAG GTGCCCAGGT 5220

ACTAAAGTTT AGTAGGaATC CATCA7GCCC AACATTATCA GGCACGAAGA AATGACGATG 5280

ATATTTAAAA CGTTCACCTA ATGCACGAAC TTGATCATCC GGATATAGCA AATCATCTAT 5340

GAACCCCATC GTTAACACTT TTGTTTCTAA ATTTTTAAAA ACATGCGTTA CGTCTGTGCG 5400

ACCTCGGTCA ATGTTGTGAC TATCCAATAC ATCTAGCAGT GTCAGATAAC AATTCAAATC 5460

AAAATGTTCT TTAAATTTAT TACCTTGATG TTGTTGGTAT GCGACTACTT CATCCGGCGT 5520

AAAACGTTCA TCATAACTTT TTGATGATCG ATATGTCAAA AAACCTAATT GGCGTGCAAT 5580

ACTTAGACCT TCCTTACCAC CAAGATGAAT GGCTTGCCTT GCAATTTCAT TGAAAGCTCT 5640

ACTATAAGAT GATGTTCGAC TTGTTGCAGC AAGGATAATG GCTTTATCTA CTTCAAACTG 5700

TTGATTGTAG AGTAGTTCCA TTGCTTGCAT ACCTCCAAGA CTTCCCCCTA TTAAAATATT 5760

AATCTTATCA TAACCAAGGG CTTGTATACC TCGTTCATTC GCTCTGACTA TATCTCTTAA 5820

TGTTAATTTT TTAGGAAAAT GAGGGTCGTT TAAAGGTGAA CTTGAACCGA AAGGACTACC 5880

AATAACATCA AATGTTAAAA ATTGATAATC GTGAATGGGT ATATATCCCC CATCAATAAT 5940

TTCTCGCCAC CAACCCGGAT AATCATCTGT TCCATATGTT AAATGATTGC CAGTTAATGC 6000

ATGACAAACT ACAACTAATG GTTGTCCATG ATAACCGACA TGCTCATATC TCAAACGCAA 6060

GTnATCTATG ACTTCCCCAG ATTCTGTAAT AAATTCCCCT AAATTTAAAG TATCTACTGT 6120

GTAATTTGTC ATTGTTCTTT CCTCCTTAAA CAAAAAAACT TCTCACCCTA TTGAAAAGTA 6180

AGAAGTCTTT ATACTTATCA TTCGAGTAAC TCGTTGGTTT TAGCACCGTG CTATAAAGTC 6240

GGTTGCTGAA GTATCACAGG G 6261

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1222 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

EP0 786 519 A2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGCGATTAA CTCTGGAAAT ATCTTTTCCA TATTTACGTn TTAAATTATT CAGCAAATTC 60

ATACGAGaTT CATACTCGTT yAACACTTGT TCG7CGAATT CTGTATTAGC CATTTCATCA 120

TATAACTCAT GTTTTGCATC TTCTAAAATG TAGTAAAATT GATCAATATC TTCTTTTAAT 180

TTGTCATATT TGTTTGGAAC TATATCGTTT ATTGTTAACA AATGGTTGCT TAGTTCATAT 240

AAACGATCAG TGATAGCATT TTCATCCGTT AATGTCATAT ATGCGTTATT AAGCGCTAAG 300

CTTAATTTTT CAGAGTTTTG AATGCGTTTA ATATCTATTT CAAGTTGCTC TATTTCGCCT 360

TCTTTTAGAT GTGCTTCAGA CAATTCTTCT AATTGGAATT TCATTAAATC TAAACGCTGT 420

AGCAATGCTT GGTCTGCTGA TTCTAAATCT TCTAACTCTT GCTTTTTGGC TTTATAATTT 480

TGAAAAGTTT GGTGATATTT ATCCAACAAA TCTTGATAAC GTGATTCTGC GTAATTATCC 540

AATAATGTTA AATGGTATTT TTGTTTCAAC AAAGACTGCG TTTCATGTTG GCCATGAATA 600

TCTAATAATT CTTGCATAAC TTTTCGTAAA TCTTGTAAAG TAACTGTTTG ATTATTAATT 660

TTACAAAGAC TTTTACCAGA GCTGAAAATT TCCCGTTTAA CTAATAAAAA ATCTTCATCT 720

ACATCAATAT CCATATTTTT CAATATATGT ATAGCATCTT TACTCTCGTC AATATCAAAT 780

ATACCTTCGA TGACAGCCTT TTTTTCACCA TGTCTTACAA AATCAGATGA AGCTCTCATT 840

CCAATTAATT GTCCAATTGC ATCTATAATA ATTGACTTAC CTGAACCCGT TTCACCACTT 900

AAAACAGTTA AACCATCAGA AAATTGAATT TCTAATTCTT CAATAATAGC AAATTGCTTG 960

ATTGATAAGG TTTGTAACAT AAACTCATCG CATCCTTATA ACAAATTGAA AATTCTTGAC 1020

TTGATTTCAT CACTTGCCTC TTTGCTTCGA CAAATAATTA AACAAGTATC ATCACCACAA 1080

ATTGTGCCTA GTACTTCTTC CCAATTGATT TGG7CTAATA TAGCTCCAAT AGATTGTGCA 1140

TTAC6AGGTA TGTTTTTAGA ACAAGTAAAT TATCAGTACC ATCTATATTA ACAAAGGAAT 1200

CCATTAAATA ACGTCCCAAT TT 1222
(2) INFORMATION FOR SEQ ID NO: 14: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
TTTGTTATTA TTACnTnAAA TAATTGCATT ACTTTTTACT GATGGTACAA CTTTCCATCC 60
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TTCTTTTGGC ACGACATAAT TGTCTTTATC TTGAACTAAA TATCCGCCAG ATACTGAAAC 180

AAACTCTTCT TCGTTACTGT CTATAGTCAT ATCAATTTCT AATAATCTTA CATTCTTCTT 240

TTGTTTTAAA ATATCTAATG CTTCATCTGT AAATTTTGGT GCAATAATGA CTTCCAAAAA 300

GATACTATGC AATTGCTCTG CTAACTCAGG TGTVACAGCT CGGTTTAATG CAACAATTCC 360

ACCAAATATT GATTGACTAT CCGCTTCATA CGCATGTTGA AATGCTTGTT CTATCGTGTC 420

ACCGATACCA ACACCACATG GATTCATGTG TTTAACCGCA ACTGTAGCAG GTGTATCAAA 480

CTTTTTAACT AAAGCTAGTG TAGCATCTGC ATCTTTAATA TTGTTATAGC TTAATTGTTT 540

CCCATGTAAT TGTTTAGCGC CTGCAATCGT GTGCTTAGCA TTCGAAGTTC TCACAAAATA 600

CGCTGATTGT TGTGGATTTT CTCCATATCT TAAAGTTTCT TTATCCCCTT TAAAGAAACG 660

TACAATCGCT TCATCATATT CTGCAGTATG CTCAAAAACT TTAATCATTA ATGATTGTCT 720

ATATGACTCA TCTAACGAAT CGTTTCTTAA TCGCGTCAAT ACTTCTTGAT AATCTGCCGG 780

ATGTACAATT GTTGTTACAT GTTTATAGTT TTTAGCTGCA GCACGTAACA TTGTTGGACC 840

ACCAATATCA ATATTTTCAA TTGCTTCGTC CATCGTCACA TCAGGGTTTG CAACAGTTTG 900

TTGGAATGGA TATAAATTAA CTACTACCAT ATCAATTAAA TCTATATGTT GTTCTGATAA 960

TTCATTTAAA TGCTGCGGTT TATTTCGATC AGCTAAAATG CCACCATGAA CAGCCGGATG 1020

T 1021

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3759 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TCATTCACTC CTAAATTGTT ATTACACTAT TACACaTAGC TAATCATCAA TGTGAAATCA 60

CCTTCAAAGA CACTATCCAA ATCTTCAGAA GTCAAAATAA AGTTTGTACC AGTAGTCAGT 120

TTGAAAATTT CACCATCGAC AATCATTTGC CCTTCGCCTT CCAACACTGT AACTAAACAG 180

AACTCTCTAG GCTTCATATA ATTTAACGTG CCAGAAATTT CCCATTTAAC CAATGTAAAG 240

AAATCATTCG ATACAATGTG TGTACACTTA TGGTTTTCAA TAATTTCGCT TTCAGGCAAA 300

ATATTAGGTA ATGGTGCATT GTACTGAATA ACGTCTAAAG CTTTTTCAAT ATTTAACGGT 360

CTATCATTAT ATTGATTATC TTGACGATTG AAATCATAAA GTCTATATGT AATGTCTGAC 420
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ATAAAAtAGa ATTCyCCAGG kTTTACtTTA AtatATCyAA gTAtCGaCtC CATCGTTCCG 540

TGTTGAACAT GATTCGCAAC TTCTTCTCTA GACTCTGCTA ATGTCCCtAT AACTATTTCT 600

GCATCTTCTT CTGCATCTAT AATATACCAA CATTCAGATT TGCCATATTG CCCgTTTTCA 660

TGCTCATAAG CATAAGAATT ATCAGGGTGC ACATGAATAG AAAGTGATTC TCTTGCATCC 720

ACTATTTTAG TTAGAAGCGG AAAATCTTTG CTTGGGAAAT CACCAAACAA TTCACGATGT 780

TCTGACCAAA TACGGTCTAA TGTTTGACCT TGATATGGTC CATTAATAAT CTCGCTCGTA 840

CCATTTGGAT GTGCTGACAC ACACCAACAT TCCCCCAGTT GTATCATTGT CTAATTGATA 900

TCCAAACTCA CTTAGACGTT GACCGCCCCA TAATTTTGTT TTTAAAATTG GTTGTAAAAA 960

TAATGGCATT GTTGCACCTC CATTGTGATT AAGTAAGCAA TAGAACTCTG ATGTTGTTGT 1020

TCCATTATAT TTTGATTTTG TTCTCATTTA CATCGTATTA TTAACTTCCA CATTTCAAAT 1080

TAACTATTAG TGATTGTACC ATATTTACTA ACATTGCAGT ACTGCCAATT AAAAGnGCTT 1140

CACTTAAATT TACAGTACTT TAACATTTTC AAAAATTTAT AGCATAGAGA TTATATCTCT 1200

CTTACATTTG TACATATTTC CCTTTAAATT TACTCGCCCA TTATACCAAT TAATAaACAA 1260

CTTTAATAGT TGTGCCATAC ATTGTTCAAA TTCTTTGTAA AACGCATAGA CAATACGTAC 1320

TTATTCATAC TTATAATTCA TCATTTTCAA AAAATAACGA GTTACGAAAA AGTAACCCGC 1380

TTCAAATCAT ATTTACTATC CTTATTAATC CGTTTCATTT TCAAATTGAG TTAAAGCATC 1440

TTTAATGTCC TGATCACCAC TAATAATTTG AAACTCTTGG TGATTAAAAT GATTGGATGT 1500

GACAATTTCT TTTAATACTG TCGCAACATC TTCTCTAGGA ATTTCACCTT TACCATCAAA 1560

ATATTGTGCA GCTTCTATCT TTCCAGATCC TGCTGCATTT GTAAGTGCCC CTGGATGTAA 1620

AATTGTATAA TTCAAACCTG nAACGTCTTA AATAGTCATC AGCGTAATGT TTAGCTATTG 1680

TATATGGCTT TAAATCACCG CTATCATCAA AAGCCTGACG TCTCGAATCA TATGTTGAAA 1740

CCATGACATA GTGTTTAATA TTGGCCTCTT TACTCGCAAT CATTGATTTA ACAGCACCAT 1800

CTAAATCGAC AATAATTGTT TTATCTGCAC CCGTGTTCCC TCCAGAACCT ACTGAAAAGA 1860

TAACTTTATC GAATGGTTTA AACGTCTCAG TTAAAGTCTC TATTGAATCA TTTTCAACAT 1920

CAACAAGAAT TGCTTTCATA CCTTGTGATT TTAACGCATT AAGTTGATCT GATTGCCTAA 1980

CACCAGCAGT AAATGGTACA TTTTCTTTTG CTAATTGTTG CACTAGTAAC GAACCTACAC 2040

CGCCATTAGC ACCTATAACC AAAATATTCA TTTACAACAC TCTCCTATkT ATTATTCTCT 2100

ATGCCATACC ACTTTATGAG ATATGTAAAA CTTGTTACAA CTATAAAAAT CAATTGACAT 2160

ACTACTGGGA ACGTATTAAA TTAATATATG AACAAATATT CATATGAAAG GATTGTCATA 2220
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tCaAGGCATT AGcGATTACA ATCGAATACG TATCaTGGAA TTGTTATCaG TCAGCGAAgC 2340

AAGTGTTGGT CACATTtCAC ATCAATTGAA TTTATCTCAA TCAAATGTCT CGCACCAATT 2400

AAAATTACTT AAAAGTGTGC ATCTTGTGAA AGCAAAACGA CAAGGCCAAT CAATGATTTA 2460

TTCATTAGAT GACATCCACG TAGCAACTAT GTTAAAGCAA GCCATACATC ACGCGAA7CA 2520

TCCTAAAGAA AGTGGGTTAT AATATGTCTC ATTCACATCA TCATCATGAC CATATGCATA 2580

GTCATGTAAC TACAAATAAT AAGAAAGTAT TGTTTATATC GTTTTTAATA ATCGGTCTAT 2640

ATATGTTTAT CGAAATCATC GGCGGTCTCC TTGCTAACAG CTTGGCATTA CTATCTGACG 2700

GTATCCATAT GTTTAGCGAC ACATTCTCAT TAGGTGTTGC ACTTGTCGCA TTTATTTATG 2760

CTGAAAAGAA TGCCACAACT ACAAAAACAT TTGGTTATAA ACGTTTCGAA GTACTCGCAG 2820

CGTTATTTAA CGGTGTAACG CTTTTTGTAA TAAGTATTTT GATTGTTTTT GAAGCGATTA 2880

AACGTTTCTT TGTTCCTTCT GAAGTTCAAT CAAAAGAAAT GTTAATCATT AGTATTATCG 2940

GTTTAATTGT CAATATCGTT GTTGCATTCT TTATGTTTAA AGGCGGCGAC ACTTCACACA 3000

ATTTAAATAT GCGTGGTGCT TTTCTACATG TTATCGGAGA CTTATTAGGT TCAGTTGGCG 3060

CCATTACTGC AGCTAkTTTA ATTTGGGCAT TTGGATGGAC AATCGCCGAT CCTATCGCAA 3120

GTATTTTAGT TTCCGTTATT ATTTTAAAAA GTGCTTGGGG TATCACAAAA TCTTCAATTA 3180

ACATTTTAAT GGaAGGCACA CCAAGTGATG TTGATATAGA TGAAGTTATA ACTACTATTA 3240

AAAAGGATTC ACGAATACAA AGTGTGCATG ATTGCCATGT TTGGACAATT TCAAATGATA 3300

TGAATGCATT GAGTTGTCAT GTTGTTGTAG ACCATACATT GACAATGAAA GAATGTGAAT 3360

TATTATTAGA AAaCATTGAG CATGATTTAT TACATTTAAA TATTCACCAT ATGACTATTC 3420

AATTAGAAAC GCCTAATCAC AAACATGATG AATCGATTAT ATGTTCAGGA ACACATAGTC 3480

ATTCACATAA CCATCATGCT CATCATCACG CGCATGTACA TTAATAATTT TAACCTACTG 3540

CCATTGCATC GATTAAACTT TTCAATGGCA GTAGGTTTTT TATGTCTTTA TGGCGACTTG 3600

TTTGGTCTTT GATGATGCAA TGTTTATTAA CAAATTTTCA ACTATTATTT CTTACATTAG 3660

TCATATTTTT GACAATTTAC TATTATAATT CTCTAACTTT AGTCACTTTA ATTAATTTTT 3720

ATTAGATATT AATATGAAAA TAACGTGTTT TTTGTTATT 3759

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13086 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

TAATTATCGC GCATAACAAA ACATTAGCAG GACAATTATA TAGTGAGTTT AAAGAATTTT 60

TTCCTGAAAA CAGGGTGGAA TACTTTGTAA GTtACTATGA TTATTATCAn CCAGAGGCAT 120

ACGTACCGTC TACTGACACT TTTATTGAAA nAGATGCCTC AATCAnTGAT GAAATTGATC 180

AACTACGACA TTCTGCTACA AGTGCATTAT TTGAACGCGA TGATGTAATT ATTATTGCTA 240

GTGTAAGTTG TATATATGGT TTAGGTAATC CTGAAGAATA TAAAGATTTA GTAGTAAGTG 300

TTCGAGTTGG TATGGAAATG GATAGAAGTG AATTACTTAG AAAACTTGTc AGATGTGCAA 360

TATACACGAA ATGACATCgA TTTcCAACGA GGAACGTTTC GAGTGCGTGG TGATGTAGTG 420

GAAATATTCC CAGCCTCTAA AGAAGAACTT TGTATAAGGG TTGAGTTTTT CGGCGATGAG 480

ATTGACCGTA TCCGAGAAGT TAACTACCTA ACAGGTGAAG TGTTGAAAGA AAGAGAACAT 540

TTTGCGATAT TCCCAGCTTC TCACTTCGTA ACACGTGAAG AAAAGTTGAA AGTTGCGATT 600

GAACGTATTG AAAAAGAATT GGAAGAACGA TTGAAAGAAT TACGAGATGA GAATAAATTA 660

CTAGAAGCGC AAAGGTTAGA ACAGCGTACC AACTATGATT TAGAAATGAT GCGAGAGATG 720

GGATTCTGTT CAGGAATTGA AAACTATTCC GTACATTTAA CTTTGCGACC ACTGGGTTCG 780

ACACCATATA CTTTATTGGA TTACTTTGGC GATGATTGGT TAGTAATGAT TGATGAATCA 840

CATGTGACAT TACCGCAAGT TCGAGGCATG TATAACGGAG ACAGAGCGCG TAAACAAGTT 900

TTGGTGGATC ATGGGTTTAG ATTACCGAGT GCATTAGATA ACCGTCCACT TAAATTTGAA 960

GAATTTGAAG mAAAGACAAA ACAACTTGTG TATGTATCTG CAACGCCTGG ACCATACGAA 1020

ATTGAACATA CGGATAAGAT GGTTGAACAA ATTATTCGTC CTACTGGTTT ACTGGATCCT 1080

AAGATTGAGG TTAGACCTAC TGAAAATCAA ATTGACGATT TATTAAGTGA AATTCAAACA 1140

AGAGTgAGCG TAATGAACGC GTACTTGTTA CAACGCTCAC TAAAAAGATG AGTGAAGATT 1200

aACCACATAC ATGAAAGAaG CGGGTATTAA aGTtAATTAT CTGCATTCAG AAATCAAGAC 1260

ATTAGAACGA ATTGAAATAA TTAGAGACTT ACGAATGGGT ACATATGATG TTATCGTAGG 1320

TATTAATTTA TTAAGAGAGG GTATTGATAT ACCAGAAGTT TCTCTAGTTG TCATATTAGA 1380

TGCAGATAAA GAAGGGTTTT TACGTTCTAA CCGCTCATTA ATTCAAaCAA TAGGTAGAgC 1440

TGCGCGTAAC GATAAaGGTG AAGTCATTAT GTATGCCGAT AAAATGACTG ATTCGATGAA 1500

GTATGCAATT GATGAGACAC AACGTCGTCG AGAAATACAG ATGAAACATA ATGAAAAACA 1560

TGGTATTACA CCTAAAACAA TTAATAAAAA AATACATGAT TTAATTAGTG CTACTGTTGA 1620

AAATGACGAA AATAATGACA AAGCACAAAC TGTGATACCT AAGAAGATGA CGAAAAAAGA 1680
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TTTCGAGAAA GCTACAGAAT TAAGAGATAT GTTA7TTGAA TTAAAAGCAG AAGGGTGACA 1800

AGTAAATGAA AGAACCATCC ATAGTAGTAA AAGGTGCTCG TGCGCATAAC TTGAAAGATA 1860

TTGATATCGA ACTACCTAAA AaTAAATTAA TTGTTATGAC AGGTTTATCT GGGTCAGGTA 1920

AATCGTCATT AGCATTCGAT ACTATATATG CTGAAGGACA ACGACGTTAT GTTGAATCAT 1980

TAAGTGCCTA TGCGCGTCAA TTTTTAGGCC AAATGGACAA ACCAGATGTT GATACAATTG 2040

AAGGATTATC GCCAGCAATT TCAATAGATC AAAAAACAAC AAGTAAAAAT CCAAGATCAA 2100

CTGTA3CAAC AGTAACAGAA ATATATGATT ATATACGTTT GTTATATGCA CGTGTTGGTA 2160

AACCTTACTG TCCAAATCAC AATATAGAAA TTGAATCGCA AACAGTACAA CAAATGGTTG 2220

ACCGCATTAT GGAATTAGAG GCACGTACAA AGATTCAATT ATTAGCACCT GTCATCGCTC 2280

ATCGTAAAGG TAGTCATGAA AAGCTAATCG AAGATATTGG TAAAAAAGGT TATGTACGTT 2340

TAAGAATCGA TGGCGAAATT GTTGATGTAA ATGATGTACC TACTTTAGAT AAGAACAAGA 2400

ATCATACAAT AGAAGTTGTT GTAGACCGAT TAGTTGTTAA AGATGGAATT GAAACACGAC 2460

TAGCTGACTC TATAGAAACT GCCTTAGAGC TTTCAGAAGG ACAATTAACA GTCGATGTCA 2520

TTGACGGGGA AGACCTTAAG TTTTCAGAAA GCCATGCTTG TCCTATATGT GGATTTTCAA 2580

TCGGAGAGTT AGAACCAAGA ATGTTTAGCT TTAACAGTCC TTTTGGTGCT TGTCCGACAT 2640

GTGATGGCTT AGGCCAAAAG TTAACAGTCG ATGTAGACTT GGTTGTTCCC GACAAAGATA 2700

AGACGCTAAA CGAAGGTGCA ATAGAACCTT GGATACCGAC GAGTTCTGAT TTTTATCCAA 2760

CATTGTTAAA ACGTGTTTGT GAAGTTTATA AAATCAATAT GGATAAACCT TTTAAAAAGT 2820

TAACAGAACG TCAACGTGAT ATTTTATTGT ATGGTTCTGG TGACAAAGAA ATTGAATTTA 2880

CATTTACACA ACGTCAAGGT GGTACTAGAA AACGAACAAT GGTTTTCGAG GGTGTAGTTC 2940

CTAATATAAG TAGACGATTC CATGAATCTC CTTCAGAATA TACACGTGAA ATGATGAGTA 3000

AATATATGAC TGAACTACCT TGCGAAACTT GTCATGGAAA GCGATTGAGT CGTGAAGCkT 3060

TATCTGTTTA TGTAGGTGGT TTAAATATTG GTGAAGTAGT CGAATATTCA ATCAGTCAAG 3120

CGCTGAACTA TTATAAAAAC ATTGATTTGT CAGAACAAGA TCAAGCGATT GCAAATCAAA 3180

TATTGAAAGA AATTATTTCC CGACTCACTT TTTTAAATAA TGTGGGACTT GAATATTTAA 3240

CGTTAAACAG AGCTTCAGGT ACACTTTGAG GTGGTGAAGC ACAACGTATT CGATTAGCAA 3300

CGCAAATTGG GTCGCGTTTG ACTGGTGTCT TATATGTATT AGATGAGCCA TCAATTGGAC 3360

TGCATCAAAG AGATAATGAT CGATTAATTA ATACACTTAA AGAAATGAGA GATTTAGGAA 3420

ATACTTTAAT TGTAGTTGAA CACGATGATG ATACAATGCG TGCGGCTGAT TACTTAGTGG 3480
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AGGTAATGAA AGATAAAAAA TCATTAACAG GACAATACTT GAGTGGTAAG AAACGTATTG 3600

AAGTACCTGA ATATCGGAGA CCGGCTTCAG ATCGTAAAAT TTCTATACGT GGAGCTAGAA 3660

GCAACAATCT TAAAGGGGTT GATGTGGACA TACCACTATC AATCATGACG GTTGTTACAG 3720

GTGTATCAGG TTCTGGTAAA AGCTCATTAG TAAATGAAGT ATTATACAAA TCATTAGCTC 3780

AAAAAATTAA TAAATCTAAA GTAAAGCCAG GATTGTACGA TAAGATTGAA GGTATTGATC 3840

AACTTGATAA AATTATTGAT ATTGATCAAT CACCAATAGG TAGAACGCCA CGCTCTAATC 3900

CAGCAACATA TACTGGTGTG TTTGATGATA TACGTGATGT GTTTGCGCAA ACAAATGAAG 3960

CTAAAATTCG AGGATATCAA AAAGGGCGTT TTAGTTTTAA TGTAAAAGGT GGACGCTGTG 4020

AAgcTTGTAA AGGTGACGGT ATTATTAAAA TTGAAATGCA TTTTTTACCT GATGTTTATG 4080

TTCCTTGTGA AGTGTGTGAT GGTAAACGAT ATAATCGTGA GACACTAGAG GTTACTTACA 4140

AAGGTAAAAA TATTGCTGAC ATTTTAGAAA TGACTGTTGA AGAAGCAACA CAATTTTTTG 4200

AAAATATTCC TAAGATTAAG CGCAAGTTAC AAACACTAGT TGATGTTGGT CTTGGATACG 4260

TCACATTAGG TCAACAAGCT ACAACGTTAT CAGGTGGTGA GGCTCAACGT GTGAaACTTG 4320

CATCTGAACT TCATAAACGT TCAACTGGTA AATCTATTTA TATCCTAGAT GAACCGACAA 4380

CAGGGTTACA TGTTGACGAT ATTAGTAGAT TATTAAAAGT ATTAAACCGA TTAGTTGAAA 4440

ATGGTGATAC TGTTGTAATT ATTGAACATA ACCTAGATGT TATCAAAACA GCAGACTATA 4500

TTATAGACTT AGGTCCTGAA GGTGGTAGTG GCGGTGGTAC TATTGTTGCG ACTGGCACAC 4560

CCGAAGATAT TGCTCAGACA AAGTCATCAT ATACAGGAAA GTATTTAAAA GAAGTACTTG 4620

AACGAGATAA ACAAAATACT GAAGATAAAT AAGATTAAAA GAAGTGAAGG ATGTTATAAA 4680

TTTATCCTTC GCTTCTTTTT ATTAATTTAG TAATGAATAG TAGAAAGAAA AGATGCGTAA 4740

AAAGAATTAT GTTAAGATAG GGTCAATCTA GAGTAGTTAA ACATAAATCG AACTGGGAGT 4800

GGGACAGAAA TGATAAAGAA TCACTAATGA TTTATTATGT AGTGGTTCTT TGTCATTAGC 4860

CACAGCTATT GTGTACTTAA AAATAGGaat GCaTgAGTGC AACTCATGCA TAAGaAATAC 4920

TAATTTCTAA AGAAAAAGTA TTTCTTTATG TTGGGGCCCC GCCAACTTGC ATTGTTTGTA 4980

GAATTTCTTT TCGAAATTCT TTATGTTGGG GCCCCGCCAA CTTGCATTGT TTGTAGAATT 5040

TCTTTTCGAA ATTCTTTATG TTGGGGCCCC GCCAACTAAT TCCAATATAT CATTGTAGAG 5100

CTTAGGTCAT TGATTTTTGG CTCGGACTTT TATGGCGATA TGAACCATGT AAATTAAGCA 5160

AGCAATAAAT TAATGATTGA TATTGACTTG TAAAATAATA ACAATAATGA ACAATTAATA 5220

TTTATTTTAG CTTTTCAATG TAGATTGGTG TTATATTTTT GATATGATAA GAAGAGATGT 5280
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ACATTAAAGT TAGATTTAAT CGCTGGTGAA GAAGGACTAT CGAAGCCAAT TAAAAATGCT 5400

GATATATCAA GACCGGGCTT AGAGATGGCA GGTTATTTTT CACATTATGC GTCAGATAGA 5460

ATACAACTAT TAGGAACAAC GGAACTATCG TTTTACAATT TATTACGAGA TAAGGATCGC 5520

GCAGGTCGTA TGCGTAAACT ATGCAGACCA GAAACGCCTG CAATTATTGT GACACGTGGA 5580

TTGCAGCCAC CAGAAGAATT AGTTGAAGCT GCAAAAGAAT TAAATACCCC ACTTATAGTT 5640

GCTAAAGATG CGACTACAAG TTTAATGAGT CGCTTAACAA CGTTTTTAGA GCATGCACTT 5700

GCAAAGACGA CATCTTTACA TGGTGTTTTA GTAGATGTTT ACGGTGTTGG TGTACTAATT 5760

ACCGGTGATT CAGGAATAGG TAAAAGTGAG ACTGCGTTGG AATTAGTTAA ACGTGGGCAT 5820

AGATTAGTAG CAGATGATAA TGTAGAAATA CGTCAAATTA ATAAAGATGA ACTAATAGGG 5880

AAACCACCAA AGTTAATAGA ACATCTATTA GAAATACGTG GACTAGGTAT TATCAATGTT 5940

ATGACTTTAT TTGGCGCGGG TTCAATATTA ACTGAAAAAC GAATTAGATT AAATATTAAT 6000

TTGGAAAACT GGAACAAGCA AAAGTTATAT GACCGCGTAG GTCTTAATGA AGAGACGCTA 6060

AGTATTTTAG ATACTGAAAT CACTAAAAAA ACAATACCTG TAAGACCTGG TAGAAATGTT 6120

GCGGTAATTA TTGAGGTCGC TGCAATGAAC TATCGATTAA ATATCATGGG CATTAACACG 6180

GCCGAAGAAT TTAGTGAAAG ATTAAATGAA GAAATTATCA AGAACAGTCA TAAGAGTGAG 6240

GAGTAGGTTG AATGGGTATT GTATTTAACT ATATAGATCC TGTGGCATTT AACTTAGGAC 6300

CACTGAGTGT ACGATGGTAT GGAATTATCA TTGCTGTCGG AATATTACTT GGTTACTTTG 6360

TTgCACAACG TGCACTAGTT AAAGCAGGAT TACATAAAGA TACTTTAGTA GATATTATTT 6420

TTTATAGTGC ACTATTTGGA TTTATCGCGG CACGAATCTA TTTTGTGATT TTCCAATGGC 6480

CATATTACGC GGAAAATCCA AGTGAAATTA TTAAAATATG GCATGGTGGA ATAGCAATAC 6540

ATGGTGGTTT AATAGGTGGC TTTATTGCTG GTGTTATTGT ATGTAAAGTG AAAAATTTAA 6600

ACCCATTTCA AATTGGTGAT ATCGTTGCGC CAAGTATAAT TTTAGCGCAA GGAATTGGAC 6660

GCTGGGGTAA CTTTATGAAT CACGAGGCAC ATGGTGGATC GGTGTCACGC GCTTTTTTAG 6720

AACAATTACA TTTGCCTAAT TTTATAATAG AAAATATGTA TATTAACGGC CAATATTATC 6780

ATCCAACATT CTTATATGAA TCCATTTGGG ATGTCGCTGG ATTTATTATC TTAGTTAATA 6840

TTCGTAAACA TTTAAAATTA GGAGAAACAT TCTTTTTATA TTTAACTTGG TATTCAATTG 6900

GTCGATTCTT TATAGAAGGA TTACGTACAG ATAGCTTAAT GCTCACAAGT AATATTAGAG 6960

TTGCACAATT AGTATCAATT CTTTTAATTT TAATAAGTAT AAGTTTAATT GTATATAGAA 7020

GGATTAAGTA TAATCCACCG TTGTATAGCA AAGTTGGGGC GCTTCCATGG CCAACAAAAA 7080
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TTATGGCGTG TATACCGTCT TGTTAAATTT TCGAAAGTTT TTAAGAATGT AATTATCATT 7200

GAATTTTCGA AATTTATTCC AAGTATGGTA CTGAAAAGAC ATATATATAA ACAACTTTTA 7260

AATATTAATA TCGGTAATCA ATCGTCGATA GCTTATAAAG TAATGTTAGA TATTTTTTAC 7320

CCAGAACTGA TTACGATTGG TAGTAACAGT GTTATTGGTT ACAATGTAAC AATTTTGACG 7380

CATGAAGCAT TAGTTGATGA ATTTCGTTAT GGACCAGTGA CGATAGGATC TAACACTTTG 7440

ATTGGTGCAA ATGCTACCAT TTTACCCGGT ATAACGATTG GTGACAATGT AAAAGTTGCA 7500

GCTGGTACGG TTGTTTCAAA AGATATACCG GATAATGGAT TTGCATATGG CAACCCTATG 7560

TATATAAAAA TGATTAGGAG GTGACAATTT TATGGCGCAA AAGAATAATA ATGTAATTCC 7620

AATGACTTTT GATGATGCAT TTTATCGTAA AATGGCTAAA CAGAAGTTTA AACAAAGAGA 7680

ATATAAACGA GCTGCTGAAT ACTTTGAAAA AGTGTTAGAA TTGTCACCTG ATGATCTGGA 7740

AATTCAAATT GATTATGCAC AATGTCTAGT GCAACTTGGT ATTGCTAAAA AAGCAGAACA 7800

TTTATTTTAT GACAATATTA TTTATAATAG GCATCTAGAA GATAGCTTTT ATGAATTGAG 7860

TCAGCTCAAC ATTGAAGTTA ACGAACCAAA CAAGGCATTC TTGTTTGGTA TTAATTATGT 7920

TATTGTTAGC GACGACCAAG ATTATAGAGA TGAATTAGAT CAAATGTTTG ATGTGAAATA 7980

TCAAAGTGAA GAACAAATTG AACTTGAAGC TCAATTGTTT GTAGTTCAAA TACTATTCCA 8040

ATATCTTTTT TCTCAAGGTC GATTAAAAGA TGCAAAGAAT TATGTCTTAC ATCAACCACA 8100

AGAAGTTCAA GATCATCGTG TAGTACGTAA TTTATTGGCA ATGTGTTATT TATATCTCGG 8160

TGAATATGAT ACgGCTAAAG CATTGTACGA aGCACtATTA CAAGAGGATA GTACaGATAT 8220

ATATGCATTA TGCCATTATA CTTTGCTACT TTATAACACT AAGGAAAATG AACAATATCA 8280

AAAATATTTA AAAATATTAA ACAAAGTTGT ACCTATGAAT GACGATGAAA GTTTTAAATT 8340

AGGTATTGTA TTAAGTTATT TAAAGCAGTA TCGTGCATCA CAACAATTGT TGTACCCTTT 8400

ATATAAAAAA GGGAAATTTT TATCAATTCA AATGTACAAT GCTTTAGCAT ATAATTATTA 8460

TTATTTAGGT GAAGAAGACG AAAGTCATTA CTACTGGGAT AAATTGAAGC AAATTTCTAA 8520

AGTGGAAATT GGACATGCGC CTTGGGTAAT TGAAAATAGC AAAGAAGTTT TTGACCAACA 8580

TATTTTGCCA TTACTTCAAA GTGATGACAG TCATTATCGT TTATATGGTA TTTTTTTATT 8640

GGATCAATTA AATGGTAAAG AAATTGTGAT GACGGAAAGT ATTTGGCAGG TTTTGGAAAA 8700

TCTAAATAAT TATGAGAAAT TGTATTTAAC GTATTTAGTT CAAGGTTTAA CGCTCAATAA 8760

ATTAGACTTC ATTCATCGCG GCTTATTAAC GCTTTACCAT AATGAATTAT TTGTAAGTGA 8820

AAATGATGTA ATGGTTGCAT GGATTAATCA AGGTGAACTC ATAATTGCTG AAAAAGTAGA 8880

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TCGAAACGT7 ACAAAGAAGC AAATTACAAC ATGGTTAGGC ATAACACAAT ATAAACTGAA 9000

CAAAATGATT GAATTTCTCT TGAGCATATA GA7TTATGAA AAGTTAGATT TATTATATAA 9060

TGCGCATAAT GATTAATAAT GAGGAGGCGT TAATAAAATG ACTGAAATAG ATTTTGATAT 9120

AGCAATTATC GGTGCAGGTC CAGCTGGTAT GACTGCTGCA GTATACGCAT CACGTGCTAA 9180

TTTAAAAACA GTTATGATTG AAAGAGGTAT TCCAGGCGGT CAAATGGCTA ATACAGAAGA 9240

AGTAGAGAAC TTCCCTGGTT TCGAAATGAT TACAGGTCCA GATTTATCTA CAAAAATGTT 9300

TGAACACGCT AAAAAGTTTG GTGCAGTTTA TCAATATGGA GATATTAAAT CTGTAGAAGA 9360

TAAAGGCGAA TATAAAGTGA TTAACTTTGG TAATAAAGAA TTAACAGCGA AAGCGGTTAT 9420

TATTGCTACA GGTGCAGAAT ACAAGAAAAT TGGTGTTCCG GGTGAACAAG AACTTGGTGG 9480

ACGCGGTGTA AGTTATTGTG CAGTATGTGA TGGTGCATTC TTTAAAAATA AACGCCTATT 9540

CGTTATCGGT GGTGGTGATT CAGCAGTAGA AGAGGGAACA TTCTTAACTA AATTTGCTGA 9600

CAAAGTAACA ATCGTTCACC GTCGTGATGA GTTACGTGCA CAGCGTATTT TACAAGATAG 9660

AGCATTCAAA AATGATAAAA TCGACTTTAT TTGGAGTCAT ACTTTGAAAT CAATTAATGA 9720

AAAAGACGGC AAAGTGGGTT CTGTGACATT AACGTCTACA AAAGATGGTT CAGAAGAAAC 9780

ACACGAGGCT GATGGTGTAT TCATCTATAT TGGTATGAAA CCATTAACAG CGCCATTTAA 9840

AGACTTAGGT ATTACAAATG ATGTTGGTTA TATTGTAACA AAAGATGATA TGACAACATC 9900

AGTACCAGGT ATTT7TGCAG CAGGAGATGT TCGCGACAAA GGTTTACGCC AAATTGTCAC 9960

TGCTACTGGC GATGGTAGTA TTGCAGCGCA AAGTGCAGCG GAATATATTG AACATTTAAA 10020

CGATCAAGCT TAATTCGAAG TCGAATTAAG ATGTTGAGCT GTAAATTATT TGGATATTTA 10080

TTTTAATAGT GTCATCACAG CGTTAAAATA ATGTCTTACT TTTAAATTAA AGCAAATTAT 10140

ATAGSAAACT AGAACTTAGT ACGTATCATT TGTGCGTTTC AATGAGTTCT AGTTTTTTTA 10200

TATGTTATAT TAAACTTATA ACTTTATGGG AGTGGGACAG AAATGATAAA GAGCCACTAA 10260

TGATTTATTA TGTAGTGGTT CTTAAACATT AGCCACAGCT AATGTGTACT TAAAAATAGG 10320

AATACATGAG TAAAACTCAT GCATAAGAAA TACTAATTTC TATAGAAAAA GTATTACTTT 10380

ATCGTTGTCC CACCCCAACT TGCACATTAT TGTAAGCTGA CTTTCCGCCA GCTTCTGTGT 10440

TGGGGCCCCG CCAACTTGCA CATTATTGTA AGCTGACTTT TCGTCAgCTT CTGTGTTGGG 10500

GCCCCGCCAA CTTGCACATT ATTGTAAGCT GACTTTTCGT CAGCTTCTGT GTTGGGGCCC 10560

CGCCAACTTG CATTGTCTGT AGAAATTGGG AATCCAATTT CTCTATGTTG GGGCCCACAC 10620

CCCAACTCGC ATTGCCTGTA GAATTTCTTT TCGAAATTCT CTGTGTTGGG GCCCACACCC 10680
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15 



ACTCGCATTG CCTGTAGAAT TTCTTTTCGA AATTCTCTGT GTTGGGGCCC CTGACTAGAG 10800

TTGAAAAAAG CTTGTTGCAA GCGCATTTTC ATTCAGTCAA CTACTAGCAA TATAATATTA 10860

TAGACCCTAG GACATTGATT TATGTCCCAA GCTCCTTTTA AATGATGTAT ATTTTTAGAA 10920

ATTTAATCTA GACATAGTTG GAAATAAATA TAAAACATCG TTGCTTAATT TTGTCATAGA 10980

ACATTTAAAT TAACATCATG AAATTCGTTT TGGCGGTGAA AAAATAATGG ATAATAATGA 11040

AAAAGAAAAA AGTAAAAGTG AACTATTAGT TGTAACAGGT TTATCTGGCG CAGGTAAA7C 11100

TTTGGTTATT CAATGTTTAG AAGACATGGG ATATTTTTGT GTAGATAATC TACCACCAGT 11160

GTTATTGCCT AAATTTGTAG AGTTGATGGA ACAAGGAAAT CCATCCTTAA GAAAAGTGGC 11220

AATTGCAATT GATTTAAGAG GTAAGGAACT ATTTAATTCA TTAGTTGCAG TAGTGGATAA 11280

AGTCAAAAGT GAAAGTGACG TCATCATTGA TGTTATGTTT TTAGAAGCAA GTACTGAAAA 11340

ATTAATTTCA AGATATAAGG AAACGCGTCG TGCACATCCT TTGATGGAAC AAGGTAAAAG 11400

ATCGTTAATC AATGCAATTA ATGATGAGCG AGAGCATTTG TCTCAAATTA GAAGTATAGC 11460

TAATTTTGTT ATAGATACTA CAAAGTTATC ACCTAAAGAA TTAAAAGAAC GCATTCGTCG 11520

ATACTATGAA GATGAAGAGT TTGAAACTTT TACAATTAAT GTCACAAGTT TCGGTTTTAA 11580

ACATGGGATT CAGATGGATG CAGATTTAGT ATTTGATGTA CGATTTTTAC CAAATCCATA 11640

TTATGTAGTA GATTTAAGAC CTTTAACAGG ATTAGATAAA GACGTTTATA ATTATGTTAT 11700

GAAATGGAAA GAGACGGAGA TTTTCTTTGA AAAATTAACT GATTTGTTAG ATTTTATGAT 11760

ACCCGGGTAT AAAAAAGAAG GGAAATCTCA ATTAGTAATT GCCATCGGTT GTACGGGTGG 11820

ACAACATCGA TCTGTAGCAT TAGCAGAACG ACTAGGTAAT TATCTAAATG AAGTATTTGA 11880

ATATAATGTT TATGTGCATC ATAGGGACGC ACATATTGAA AGTGGCGAGA AAAAATGAGA 11940

CAAATAAAAG TTGTACTTAT CGGTGGTGGC ACTGGCTTAT CAGTTATGGC TAGGGGATTA 12000

AGAGAATTCC CAATTGATAT TACGGCGATT GTAACAGTTG CTGATAATGG TGGGAGTACA 12060

GGGAAAATCa GAGATGAAAT GGATATACCA GCACCAGGAG ACATCAGAAA TGTGATTGCA 12120

GCTTTAAGTG ATTCTGAGTC AGTTTTAAGC CAACTTTTTC AGTATCGCTT TGAAGAAAAT 12180

CAAATTAGCG GTCACTCATT AGGTAATTTA TTAATCGCAG GTATGACTAA TATTACGAAT 12240

GATTTCGGAC ATGCCATTAA AGCATTAAGT AAAATTTTAA ATATTAAAGG TAGAGTCATT 12300

CCATCTACAA ATACAAGTGT GCAATTAAAT GCTGTTATGG AAGATGGAGA AATTGTTTTT 12360

GGAGAAACAA ATATTCCTAA AAAACATAAA AAAATTGATC GTGTGTTTTT AGAACCTAAC 12420

GATGTGCAAC CAATGGAAGA AGCAATCGAT GCTTTAAGGG AAGCAGATTT AATCGTTCTT 12480
GCGTTAATTC ATTCTGATGC GCCTAAGCTA TATGTTTCTA ATGTGATGAC GCAACCTGGG 12600

GAAACAGATG GTTATAGCGT GAAAGATyAT ATCGATGCGA TTCATAGACA AGCTGGACAA 12660 

5 CCG TTTAT7G ATTATGTCAT TTGTAGTACA CAAACTTTCA ATGCTCAAGT TTT3AAAAAA 12720 

TATGAAGAAA AACATTCTAA ACCAGTTGAA GTTAATAAGG CTGAACTTGA AAAAGAAAGC 127 3 0 

ATAAATGTAA AAACATCTTC AAATTTAGTT GAAATTTCTG AAAATCATTT AGTAAGACAT 12 840 

AATACTAAAG TGTTATCGAC AATGATTTAT GACATAGCTT TAG AATT AAT TAG TACT ATT 12900 

CCTTTCGTAC CAAGTGATAA ACGTnAATAA TATAGAACGT AATCATATTA TGATATGATA 1296 0 

ATAGAGCTGT GAAAAAAATG AAnATAGACA GTGGTTCTAA GGTGAATCAT GTTTTAAATA 13 020 

AGAAAGGAAT GACTGTACGA TGAGCTTTGC ATCAGAAATG AAAAATGAAT TAACTAGAAT 13 0 80 

AGACGT 13 0 86 
(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1350 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

30 CATTAGTCAT GAAAATAGCC GACAACTTCA TCTGTGAAAT CACCGGCCTT TTATTTTAGC 6 0 

TAACTTTATT TCTGATTTTA CGATTTTAAT TGATCATACA GAGAAAGTGA TCTTTTTACA 12 0 

ATTTCTAAAA ACTCATGATC TATATTGGAC ATTTGATGAA AATAAGACAA AATGTTTTCT 18 0 

GTTAGCTTCT CTTGTTTTGG GAATGAATCA TCTTCTTTAA TCCAAATCGC TAATTCGCCT 24 0 

AATGGTGTTT TATCATCTTT AAATGTTTGT ATATATTCGT AAAAGCTCAT AGTATTCCTT 30 0 

CTCTCAATTT ACTTATATAA ATCCTACCAC GAAAGCTTTC AAGAAAACAC AATTAAATGT 36 0 

CTATTTAGTG AACTTTTTAA GGTTGTGCAC TCTTTTAATG TCTGCCAATT AGGTCAATTA 42 0 

ATCATCACAA TGTACAATTA ACTCTATTTT CAGTTCATAT ACTCACACAC CGTTTTTGAA 4 80 

45 CAACACATTA ACTTCTCATT TAGATAAAAC GCAAAAAAGC CTGGCACCAA TACAATAGAT 54 0 

GCCAGACTAA GAGTCTACTA TATAAATTTA TTTAGCGTAT GGTTTTACTT CGATTGCACC 6 00 

TTCATTTTCA TCATGAACAC CATGCTTATA ATAATCAATA TATTGTGGCT CTAAAGGCTT 66 0 

50 TCTGCCACGT ATAATGTCTG CTGCTTTTTC AG CTAACATT AAAACAGGTG CGTGTATATT 72 0 

GCCATTTGTC GTACGTGGCA TAGCTGATGC ATCAACTACA CGTAAATTTT CCATACCGTG 780

ACTACAAGAT GGGTGTAATG CTGTTTCACC ATCTCTACGA ACCCAATCAA GAATTTCTTC 900

GTCTGTTTGC ACTTCTGGTC CTGGTGAAAT TTCTCCACCA TTGAATGGAT CCATTGCTTT 960 

TTGAGATAAG ATATTTCTTG CTACACGAAT TGCTTCTACC CATTCTTTTT TATCTTCTTC 102 0 

TGTTGATAAA TAATTAAAGC GGATACTTGG TT7TTCGAAT GGATCTTTAG ATTTGATTTT 108 0 

CAAGCTACCA CGAGAGTTTG AATACATTGG TCCTACGTGA ACTTGATAAC CATGTGCGAC 114 0 

CGCTGCCTTT TG AC CATC AT ATCTTACAGC TATTGGTAAG AAATGGAACA TTAAGTTAGG 12 0 0 

ATAAt CAACT TCGTTATTTG AACGTACAAA TCCGCCACCT TCAAAATGGT TAGATGCTGC 12 6 0 

TGCACCTGTA CGTGTGAAAA TCCATTGTAA ACCAATAAAT GGcATGCGCT TGAt ATCTAA 13 2 0 

GCTTGGCtGt AATGATACAG GTTCCTTACA 1350 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1376 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

TAATGCTATT GGCAACACCA TATATGAAAn CTCCAAACGA TCCTAAACCG ACTATAGATT 6 0 

CACCAAATTT nACAATCCAT GAATAAAGTA GTGGCCATAA GAATAACAAT ATGACAACTA 120 

AAAATGTACA GTAAAATGCA GTCATAATTG GAACTAGACG TTTACCACTA AAAAATGATA 180 

ATGCTAATGG TAATTCTGTT TCACTAAACT TATTGTATGC ATAAGCTGCT ATTAAACCTA 24 0 

TTACAATACC AACAAAGACA TTGCCATTAT TCATCTTTTC AAAAGCTGAA TTTATTTCCG 3 00 

ArGCTTTCAT TCCTAATAAA GGCGCTAATT TCATTGGTGA TAATACAACT GTAACTAAAA 36 0 

AATATCCTAA CGTrGCTGCA rGCGsGACTG CAC CATC ATT TTTCTTTGCC ATTCCTATAG 42 0 

CTACACCAAT TGCAAATAAA ATACCTAATT GCTCTAAAAT CGTAGTACCT ACCGTAGTAA 4 80 

AGAACATTGC GATTTTCGGC GTCGCATGAA GTGCATTTAA CGTATTACCA ATT CCGGCAA 54 0 

45 TAATTGCTGC AGCCGGTAAA AT GG CAACT G GTAACATTAA CGAACGCCCT AAATTTTGGA 6 00 

AAAATTTATA CATTGAATGT CATCCTTCTT AAAATAATGT AGAAATATAA AGATTACTAA 660 

TGTAACTAGA ATAACTACTT CGATACTCCG TTATAGTCAC CTAGGCTTAC TAACCAGCTA 720 

TATTTCTACC TCAAGTTATT TTATAAACTT TTTACAATTT CATGCAATTC TTGTTGTAAC 7 80 

TTTGCTGTTC GTGTTTCAAT CTCTTTTGTA ATATAATCGA TACGCTCGTT TCGTTTTAAA 84 0 

AAAGACCGTG AATCTTAGTA GGACCAACAT AAGCAACAGG TAATATTGGT GACTTACTTA 960 

ACATTGCAAT TGTTGAAGCA CCaCGTTTCA AAGGTGCACC TTCTTGCGAT GTGCGAGAAC 1020 

CTGTTGGGAA GATACCAACT GTCTTATTAT CTTTCAACAA ATTGATTGGG CGTTTTAAAG 10 80 

TACTAGGTCC TGGATTTTCA CGATCTACAG GAAATGCATT TAAAGACGTT AAAAATTTAC 114 0 

CAATCCATTT ATTTTTGAAT AATTCTTTTT TAGCCATATA ATGAATTTGA TTAGGATATA 12 00 

ATGCCATACC TAGCATAATG ACTTCGTTAT AACTTTCATG CGTACAAGTT ACGACATATT 126 0 

TACTATCCTT AGGAATATTA TCTTTACCGA TTACGTATAA TGATTTTGAC ATTTTAACTA 13 20 

AAATGAAATT CAAAATCTTA CTAATCACTG AATACATTGT GCCACCTACT TAACTT 13 76 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7363 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

TTGTCATACC AATATTTTGT AAAAT A TGG A ACACAAGTAA AGTGACGAAA CCAACGATAA 6 0 

AGATTTTGTT AAATTGATCT TCAATTTTCG CAGCTAATCT TATTAGATGG AAGATTAAAA 12 0 

ATAAAAATAT TAAGATCAAT ATGACAGAAC CGATAAAGCC AAGTTCCTCT CCAATCACTG 180 

AAAAGATAAA GTCAGTATGA TTTTCAGGTA TATAAACTTC ACCGTGATTG TATCCTTTAC 24 0 

CTAGTAACTG TCCAGAACCG ATAGCTTTAA GTGATTCAGT TAAATGaTAG CCATCACCAC 3 00 

TACTATATGT ATAGGGGTCA AGCCATGAAT TGATTCGTCC CATTTGATAC AGTTGGaCAC 3 60 

CTAATAAATT TTCAATTAAT GCGGGTGCAT ATAGaATACC TAAAATGACT GTCATTGCAC 420 

CAACaATACC TGTAATAAAG ATAGGTGCTA AGATACGCCA TGTTATACCA CTTACTAACA 4 30 

TCACACCTGC AATAATAGCA GCTAATACTA ATGTAGTTCC TAGGTCATTT TG CAGTAAT A 54 0 

TTAAAATACT TGGTACTAAC GAGACACCAA TAATTTTGAA AAATAATAAC AAATCACTTT 6 00 

GGAATGATTT ATTGAATGTG AATTGATTAT GTCTAGAAAC GACACGCGCT AATGCTAAAA 660 

TTAAAATAAT TTTCATGAAT TCAGATGGCT GAATACTGAT AGGGCCAAAC GTGTACCAAC 720 

TTTTGGCACC ATTGATAATA GGTGTAATAG GTGACTCAGG AATAACGAGC AAGCCTATTA 7 80 

ATAATAGACA GATTAAGAAA TACAATAAAT ATGTATAATG TTTAATCTTT TTAGGTGAAA 840 

TAAACATGAT GATACCTGCA AAAATTGCAC CTAAAATGTA ATAAAAAATT TGTCTGATAC 900 
TTGCTAAAAC AGCTATAGTG GCTACTAATA CCCAGTCTAC TTTGCGAAnC aATGCTTATC 1020

CGGCTGTTGA CGAGATGAAT AATTCATTGC AAACTCCTTT TATACTCACT AATGTTTATA 1080

TCAATTTTAC ATGACTTTTT AAAAATTAGC TAGAATATCA CAGTGATATC AGCTATAGAT 1140

TTCAATTTGA ATTAGGAATA AAATAGAAGG GAATATTGTT CTGATTATAA ATGAATCAAC 1200

ATAGATACAG ACACATAAGT CCTCGTTTTT AAAATGCAAA ATAGCATTAA AATGTGATAC 1260

TATTAAGATT CAAAGATGCG AATAAATCAA TTAACAATAG GACyAAATCA ATATTAATTT 1320

ATATTAAGG7 AGCAAACCCT GATATATCAT TGGAGGAAAA CGAAATGACA AAAGAAAATA 1380

TTTGTATCGT TTTTGGAGGG AAAAGTGCAG AACACGAAGT ATCGATTCTG ACAGCACAAA 1440

ATGTATTAAA TGCAATAGAT AAAGACAAAT ATCATGTTGA TATCATTTAT ATTACCAATG 1500

ATGGTGATTG GAGAAAGCAA AATAATATTA CAGCTGAAAT TAAATCTACT GATGAGCTTC 1560

ATTTAGAAAA TGGAGAGGCG CTTGAGATTT CACAGCTATT GAAAGAAAGT AGTTCAGGAC 1620

AACCATACGA TGCAGTATTC CCAITATTAC ATGGTCCTAA TGGTGAAGAT GGCACGATTC 1680

AAGGGCTTTT TGAAGTTTTG GATGTACCAT ATGTAGGAAA TGGTGTATTG TCAGCTGCAA 1740

GTTCTATGGA CAAACTTGTA ATGAAACAAT TATTTGAACA TCGAGGGTTA CCACAGTTAC 1800

CTTATATTAG TTTCTTACGT TCTGAATATG AAAAATATGA ACATAACATT TTAAAATTAG 1860

TAAATGATAA ATTAAATTAC CCAGTCTTTG TTAAACCTGC TAACTTAGGG TCAAGTGTAG 1920

GTATCAGTAA ATGTAATAAT GAAGCGGAAC TTAAAGAAGG TATTAAAGAA GCATTCCAAT 1980

TTGACCGTAA GCTTGTTATA GAACAAGGCG TTAACGCACG TGAAATTGAA GTAGCAGTTT 2040

TAGGAAATGA CTATCCTGAA GCGACATGGC CAGGTGAAGT CGTAAAAGAT GTCGCGTTTT 2100

ACGATTACAA ATCAAAATAT AAAGATGGTA AGGTTCAATT ACAAATTCCA GCTGACTTAG 2160

ACGAAGATGT TCAATTAACG CTTAGAAATA TGGCATTAGA GGCATTCAAA GCGACAGATT 2220

GTTCTGGTTT AGTCCGTGCT GATTTCTTTG TAACAGAAGA CAACCAAATA TATATTAATG 2280

AAACAAATGC AATGCCTGGA TTTACGGCTT TCAGTATGTA TCCAAAGTTA TGGGAAAATA 2340

TGGGCTTATC TTATCCAGAA TTGATTACAA AACTTATCGA GCTTGCTAAA GAACGTCACC 2400

AGGATAAACA GAAAAATAAA TACAAAATTG ACTAACTGAG GTTGTTATTA TGATTAATGT 2460

TACATTAAAG CAAATTCAAT CATGGATTCC TTGTGAAATT GAAGATCAAT TTTTAAATCA 2520

AGAGATAAAT GGAGTCACAA TTGATTCACG AGCAATTTCT AAAAATATGT TATTTATACC 2580

ATTTAAAGGT GAAAATGTTG ACGGTCATCG CTTTGTCTCT AAAGCATTAC AAGATGGTGC 2640

TGGGGCTGCT TTTTATCAAA GAGGGACACC TATAGATGAA AATGTAAGCG GGCCTATTAT 2700

AAACCCTAAA GTAATTGCCG TCACAGGGTC TAATGGTAAA ACAACGACTA AAGATATGAT 2820

TGAAAGTGTA TTGCATACCG AATTTAAAGT TAAGAAAACG CAAGGTAATT ACAATAATGA 2880

AATTGGTTTA CCTTTAACTA TTTTGGAATT AGATAATGAT ACTGAAATAT CAATATTGGA 2940

GATGGGGATG TCAGGTTTCC ATGAAATTGA ATTTCTGTCA AACCTCGCTC AACCAGATAT 3000

TGCAGTTATA ACTAATATTG GTGAGTCACA TATGCAAGAT TTAGGTTCGC GCGAGGGGAT 3060

TGCTAAAGCT AAATCTGAAA TTACAATAGG TCTAAAAGAT AATGGTACGT TTATATATGA 3120

TGGCGA7GAA CCATTATTGA AACCACATGT TAAAGAAGTT GAAAATGCAA AATGTATTAG 3180

TATTGGTGTT GCTACTGATA ATGCATTAGT TTGTTCTGTT GATGATAGAG ATACTACAGG 3240

TATTTCATTT ACGATTAATA ATAAAGAACA TTACGATCTG CCAATATTAG GAAAGCATAA 3300

TATGAAAAAT GCGACGATTG CCATTGCGGT TGGTCATGAA TTAGGTTTGA CATATAACAC 3360

AATCTATCAA AATTTAAAAA ATGTCAGCTT AACTGGTATG CGTATGGAAC AACATACATT 3420

AGAAAATGAT ATTACTGTGA TAAATGATGC CTATAATGCA AGTCCTACAA GTATGAGAGC 3480

AGCTATTGAT ACACTGAGTA CTTTGACAGG GCGTCGCATT CTAATTTTAG GAGATGTTTT 3540

AGAATTAGGT GAAAATAGCA AAGAAATGCA TATCGGTGTA GGTAATTATT TAGAAGAAAA 3600

GCATATAGAT GTGTTGTATA CGTTTGGTAA TGAAGCGAAG TATATTTATG ATTCGGGCCA 3660

GCAACATGTC GAAAAAGCAC AACACTTCAA TTCTAAAGAC GATATGATAG AAGTTTTAAT 3720

AAACGATTTA AAAGCGCATG ACCGTGTATT AGTTAAAGGA TCACGTGGTA TGAAATTAGA 3780

AGAAGTGGTA AATGCTTTAA TTTCATAGAG ATTAGTCGAG GGACCTTTTA CTTATAAAAA 3840

TGATTTGAAT TAATACTAAA AGATTACAAA GAAGAGGTGG TTTTGTGTGT AAATACAAAA 3900

TTGCCTTTTT CTTTTTATGT TAAATCTATA AATTTGAAAC TAAATCAAGG TTAATTCTAT 3960

GTACACACTT TATATAGGAA GTAGTTTGAA TGTTTATATA ATGTTTTACA AAAAGATGTA 4020

GTATTATAAT GTCTAATTTC ACATGTGTTT CAGTAAAATT TGTTGTGGAA TGTTAACGAT 4080

ATACGTATTT TATAAAAaAT TTTTTATAAT GATTATTCGA ATGATGCGTA ACGCTTACAT 4140

CTTATCTAAT GCTAGCTTTT TGACAAAAAT ATGACAATCA ATTAATGTGA TTCTAATAAA 4200

TATTCGCAAA TTGCTTTATT GCGATTAAAT TTTTTTGGTG GTACTATATA GAAGTTGATG 4260

AAATATTAAT GAACTTATAT GCAAAAGTAT ATTGAGAAAT AAACAGGTAA AAAGGAGAAT 4320

TATTTTGCAA AATTTTAAAG AACTAGGGAT TTCGGATAAT ACGGTTCAGT CACTTGAATC 4380

AATGGGATTT AAAGAGCCGA CACCTATCCA AAAAGACAGT ATCCCTTATG CGTTACAAGG 4440

AATTGATATC CTTGGGCAAG CTCAAACCGG TACAGGTAAA ACAGGAGCAT TCGGTATTCC 4500

AGAATTGGCA ATGCAGGTAG CTGAACAATT AAGAGAATTT AGCCGTGGAC AAGGTGTCCA 4620

AGTTGTTACT GTATTCGGTG GTATGCCTAT CGAACGCCAA ATTAAAGCCT TGAAAAAAGG 4 6 80 

CCCACAAATC GTAGTCGGAA CACCTGGGCG TGTTATCGAC CATTTAAATC GTCGCACATT 4 74 0 

AAAAACGGAC GGAATTCATA CTTTGATTTT AGATGAAGCT GATGAAA7GA TGAATATGGG 4800 

ATTCATCGAT GATATGAGAT TT ATT AT G G A TAAAATT CCA G CAGTACAAC GTCAAACAAT 4 86 0 

GTTGTTCTCA GCTACAATGC CTAAAGCAAT CCAAGCTTTA GTACAACAAT TTATGAAATC 4 92 0 

ACCAAAAATC ATTAAGACAA TGAATAATGA AATGTCTGAT CCACAAATCG AAGAATTCTA 4 980 

TACAATTGTT AAAGAATTAG AGAAATTTGA TACATTTACA AATTTCCTAG ATGTTCATCA 504 0 

ACCTGAATTA GCAATCGTAT TCGGACGTAC AAAACGTCGT GTTGATGAAT TAACAAGTGC 510 0 

TTTGATTTCT AAAGGATATA AAGCTGAAGG TTTACATGGT GATATTACAC AAGCGAAACg 5160 

TTtAGAAGTA TTanAGAAAT TTAAAAATGA CCAAATTAAT ATTTTAGTCG CTACTGATGT 522 0 

AGCAGCaAGA GGACTAGATA TTTCTGGTGT GAGTCATGTT TATAACTTTG ATATACCTCA 528 0 

AGATACTGAA AGCTATACAC ACCGTATTGG TCGTACGGGT CGTGCTGGTA AAGAAGGTAT 534 0 

CGCTGTAACG TTTGTTAATC CAATCGAAAT GG A TT AT AT C AGACAAATTG AAGATGCAAA 54 00 

CGGTAGAAAA ATGAGTGCAy TcGTCCACCA CATCGTAAAG AAGTACTTCA AG CACGTGAA 54 6 0 

GATGACATCA AAGAAAAAGT TGAAAACTGG ATGTCTAAAG AGTCAGAATC ACGCTTGAAA 5 52 0 

CGCATTTCTA CAGAGTTGTT AAATGAATAT AACGATGTTG ATTT AGTTG C TGCACTTTTA 558 0 

CAAGAGTTAG TAGAAGCAAA CGATGAAGTT GAAGTTCAAT TAACTTTTGA AAAACCATTA 564 0 

TCTCGCAAAG GCCGTAACGG TAAACCAAGT GGTTCTCGTA ACAGAAATAG TAAGCGTGGT 5700 

AATCCTAAAT TTGACAGTAA GAGTAAACGT T CAAAAGG AT ACTCAAGTAA GAAGAAAAGT 57 60 

ACAAAAAAAT TCGACCGTAA AG AG AAG AG C AGCGGTGGAA GCAGACCTAT GAAAGGTCGC 5820 

ACATTTGCTG ACCATCAAAA ATAATTTATA GATTAAGAGC TTAAAGATGT AATGT CTTGA 588 0 

GCTCTTTTTT GTTTTCAATA ATTGATTCTC TGTAGATATC aAAGTaCTAA CGTTTTAAAG 594 0 

GTTAAATATT TAATTGGATT GAGATCTGTA TGCGGTTATA TCaTTCTGTG TAAATATGGT 6000 

TCTCCACCAA ATGTGGTGAG TATATAATTT AAAGAACTAT TTTTAAATTA AGAATAATCG 606 0 

AACATAAATA AACTTTATGA AATTTCAGTA TCATGTTCTT ATAAAAAACA ATAGGGCTTT 612 0 

TTGctGACGC TAGTGCGCGA TAAATAATAA GTTGAATATA AAAAAGATCA CTGCCAATCA 6180 

50 TTCGTTTAAT GGCAGCGATC TTTTTTATTT AATTATTTCT CTTTCCACTG CAACATTTGA 6 24 0 

TAACCAATGC GTGGATGTGT TTTAATAATA TCTTTTGCGT CCTCATGACA TTGTGAAAGT 6300 
CCATATATTC GTTTTAATAT CATCTCATAA GTGAGTACTT TTCCTTTATG ATTTGACAAT 6420

AGTTCTAACA AGCTAAATTC ATTTGGCGTC AAATGTACCT CCTGATTATT AATAACAACA 6480

GATTTGGAGC CAAAGTCGAT GCTTAGCAAA CCGTTAGTAA ATACAATGTT AGTTTCTTGA 6 54 0 

TGTGACTTAG CGATTCTCTC GATGACTCGT A7TCGTGCCC GAA3CTCATC AACATTAAAA 66 00 

GGTTTAGTCA 7ATAGTCATT CGCACCGTTA TCTAAAGCTT GAATAATTGT TTGTTCTTCT 6660 

TGTCTTGCAC TTATTACAAT GATAGGAATG TCAGTATGTT GCCTGATTTC TGAAATCAAA 6720 

CATAATCCAT CTTTATCTGG TAAACCTAAA TCTAATAAAA TGACATCTGG TTTATCAATT 67 8 0 

TGAATTTTAA AGTGTGCTTG TGTGGCATTG TCGGCTGTAG TTACATTGTA ATAATCTAAA 684 0 

GTTAATGCAA CATCAAGTAA ATGTGTGATT GCGTGATCAT CTTCAATTAT CAATATTTTA 69 0 0 

GATTGCATTA TACGTCTCCT TCGTTAAAGT CTGTATATAT ATTGAAATAG AATATACTGC 696 0 

20 CGTGTGGTTG GTTCGGTTTA TATTGTAAGT TTGATTGATG TTTGTGTAGG ATAGTCTGTA 702 0 

CTAAATATAA GCCTAGTCCC ATGCTTTCTT TTTGGTTATC TTTAAAATAT TTATTTGATC 70 80 

CTGTGTAAAA AGGCTCGAAT ATCTTTTGTt GTTCTTCTAA ACTAATTCCA GGTCCTTCGT 714 0 

25 CTATAACGGC AAATTCGATT TGTTCATAGC TAGCATAACG AATAGATAAA TTGATTTTGG 720 0 

TGTCAGTAGA AGTGTGTTTA A CTGC ATTTT CAATCAAATT GAAt AAAgCT TGTAAAATCA 726 0 

ACTTACTGTC AATGTGTATA AACtGTAAAT TTACTGAGGA TGATACAGTT ATACGCTTTT 732 0 

TTAAATGGCG ACGTTCTAAA ATA CAT AT CG ATTTCTTATA CTA 73 6 3 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10470 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:



TTAACAATCG ATAACCACAA TACTTCTATT GTAATTGTTT AACGATTTCn CGATTAAAAT 6 0 

CAT CT AAAT C GTCTGGTACT CGACTTGTTA CAATATTGTT GTCTACAcTa CTGACTCATC 12 0 

AACTACATGT GCGCCTGCAT TTGATAAATC TTTGCGTACA TTTAATACTG CTGTTAACGT 18 0 

ACGACCTTTT AAATCGTCTG TATCTATTAG TATTTGTGGC CCATGACAAA TGG CAAATGT 24 0 

TGGTACATCA TTTTTAGTAA AGTATTTAGC AAATGTGCCA TATCGACCTT CTGTATCTCC 3 00 

ACGTAAATGA TCTGGTGAAA ATCCTCCAGG AATTAATAAT GCATCATAAT CTTCTGGTTT 36 0 
ATTTGCAGTA TCTCCAATCA CTACAG7ATT AAAGCCTGCA TTCTCTAATG CCTCTTTAGG 480

GCTTGAATAT TCTATATCTT CAAATTCGTT TGCTAGAATA ATTGCTACTT TTTT AG T CAT 54 0 

TGAAAATCAC CTTTCTATAT ATCATTGATA TAATTACTAT AGACAAGTAA ATCAGTGATT 600 

AAA CAT A C AA GATATAAAAA ATATTAAGCG ACTGTCGCGA TATCTAACCC TAACACATCT 660 

TATGTGGCAT TTACTTAGAT ACTAATTTAA CCTTTTCTTC AAGCTCATCT AACAATCCAA 72 0 

TCCATTCATC TATATCTTCA A CA CGTACTT CATCAGGATT TACATGATCG ATATCCTCAA 780 

TAAACTTATT TAAACGCGCT TTTATCTGTT CGATTGTTTG CTGTTCATTC ATAAAAAGTT 84 0 

AACTCCTTTT ATTTTGTTTT CTTTTTCATT ATTATCCTAA CAGAAATTGC GTTAAAGCGA 900 

TATAATCTTA GCTATATTTA TGACATTCAA ATTATTTTGA CTTTTAAAAA TCCCCTTTTC 96 0 

AATTAACTAA AATTAAGAGA TAATTTGTTA CGAGTGATAA TACGAaGkGG TaTCATACCG 102 0 

20 ATATGAACCA AATAGAAAGA AGGAAGTTTA AGACGATGAA TAGCGTCAAA TTGAAGCAAC 10 BO 

CTGTTAGCAT TTACAATGAT CCATGGGAAG TGAAATTTAT ATACATTTAA ATTTCATGAG 114 0 

ACAATAAACG TTGATTTAAT GCGTTTTTTT GCCTTTTTTA TTTTCCTTAT TTTTTCTGTT 12 0 0 

25 TTACAACAAA ATGGTATCAA AAATGGTATC ATTTGTAGTT ATTTTAGCTT CACATATTAA 126 0 

AACAACCACA CTCCTAAATT AATAGGTGGT GTGGTTTTGT TGGTTGTGTG GGGATAAAAA 132 0 

TAACCGCATC AGTTAAGATG CGGTTATCTA GCAAGGGCCA CGTATTTATA AATACGTTTA 13 8 0 

GAATCTCTTC GGCAACTTTG CTATAGACAG TCTATGCTGT TACTAAATTA TACCACCACA 144 0 

CAAACCTACT CCCATTCAGG AACACAGAGC TTTGTCGCTC GTCAGCAACG T CAT ATG AAT 1500 

TCTCAGTTCA TGTTGTGGTG ACACTTTAAA CGGTCTGTGC CAGTAGCGAC CGAGTCATTT 1560 

CAAGAATGAC CATTTCACAT TTATATTATA ACACTTGTCG TGCGTAACTG TATAGTTTTT 1620 

CAGTTGTATT TAAAGTTAAG TTATCTACTT CGCGCTTTCC TTGCCTTAAT TGTGAAATTA 16 80 

CATATTGCGC TACGCCAGTT TGTTTGTGAA TTTGGTAACC TGTTATATCA CTTTTGATCA 174 0 

ATTCAATTAT TTTTAATTTA TAATCACTCA TATTATCTAC GTCCATTCTT TTTATCTAAA 18 00 

CAATAAAAAT GTGTCTTTCT CCCGATAAAT AATAACAATG GTAGGCTTAA TAAAAACAAT 1860 

ATTAAATACA TTTGTTCTGT CATAATTGAA AACCTCCAAA TAATATTATA TTATATAAGT 192 0 

GTAAGGAGGA GCCATCAGGC TCCAAGCATA ATGTTAATCT TTGTTGTTTG GCTTTCGGTC 198 0 

TAGGTAGCCG AGATGCCaTT CTCTAAGTTG TTTTAACACT TCTGGAATTA TCAGTACTGC 2 04 0 

CAATACTTGA TGTTCTAGAA GTGTTTTTAT TATGTCTAGC ATGAGGCTTT TCACCTCCTT 210 0 

ACACATAATT TGTAAGTCAT CAACTAACCT A CAAAT AT AA TTATACTAAA CAAATGTTTA 216 0 
GTTATCTACA TTTAAATCTT GAGAGAAATG TTAAAAAGTT CTAGTAAAAT AATAGCACAT 2280

TTTATCTTTA AATGTAAATA GAAAGCAGGT ATGTAACGCA CCTGCTTAAA TAGaCATGAC 2 340 

TATGTCATTC TAACTGATTT CTCCCCATAA GTCACCTAAT ATCTGATTAG GTGGGGCAGA 24 00 

ACCATTCCAT GTTCTAATAG G CAAGT AAT A ACGTTGCCCC TCCCATGTAT ATCCTACCCA 24 60 

AACATGAGCA TCTTGTAACA TCACTTCTGT ATAATCACAA TACCCACCAG GTTGGAACTG 2520 

ATAACCCACT GGACAAGATA AGAATGGCCC CACTTTTCTT ACTGTGATTG GTTGATTGCC 2 580 

GTTTGTGAAT CTAGCACTTT CTTCCATGTA GTAAGTACCA TATTTATTAC GTTTCCATGC 264 0 

ACTTGCAACT GGTTTAACTG TATTACTTGA AGCGCTTGAC T CAT TAG AG A CAGTGGCAAC 2 700 

CGGTATTTTA CCATCCATGT ACGCCCTAAT CTGCTTGATA AAGTAGTCTT TAAGTTGCAA 2 760 

CCGCTTGTCT TCTGGCAATA GACCGCGAGT TACTGGGTCA AAACCAGTGT GTAAAACCGA 2 820 

20 ACTTCTATGA GGGCATGATG TTGAAGTAAA TTCATTGTGC AATCTGATTG TATTTCTGTT 2B80 

TGCTGGTAAT CCCCATTTTT TCAACAATCT AG CG CATTCT TGGAAAGTTG CCTGTTCATT 2 94 0 

TTTTAAGAAT GTCGCGTTAT CTGCGCCCAT TGATTGACAT ACTTCAATAC CG T AAT AAT A 3 00 0 

25 TTTATTACCT ATTTGATTAG CGGTATGCCA ACCTACTTGT GATTCATCTA AGGCTTGCCA 3 0 60 

AACTGTGTTG CCTGATACGT AACTATGCGC AATGCCCGCT TCTAATCTTG ATAAAGGTGC 3120 

ATTTACTAAT CCGTTACGAT ATGCTTCAGC AGTCGCCCCT TTGCTCCCTG CGTCGTTGTG 3180 

TATAACTATA CCTTTAGGGT TACTACCACG CTTAGGTAGG TCATAACCTT TAACCACATC 3 24 0 

TTTGATGATT TTAAGTTCTA CTGCTTTAGG TTGTGGCTTA GCTGTTTCTT TTTTAGGTGC 3 300 

TTGTGTAGGA GATTGAACTG ATCGTGGCGC TGTCTCACTT TTAAAATTCG G A CGG AT AAA 3 360 

CCACATAGGG AAATCATAAG CATGTTGTCG TCTTGTAACT TTTTCCCAAC CCCAGCCGGG 34 2 0 

TTGTTCGATT CCGTCAGTCC AGCCACCGCC TAGCCAATTC TGCTCATATA CAATGATGTA 34 80 

ATCTAAAGTT GCTTCAATTA CCCATGCAAC GTGACCATAT CCAGCACCGT AGTTGCTACC 3 54 0 

GAATACCACC ATGTCGCCAG GTTGTGCTAA GAAGTCCGGT GTATTTTGGT ATACAGTAGC 3600 

TAATCCGTCG AAGTTGTTAG CGAACGGAAT ATCTTTTGCA CCTAAACCTT TTAGAAGTAA 3660 

TCCAAACAAA ACTTTCCAAC CAGCATTGGC AT AAT CAAAG CATTGAAATC CATACCATAA 3 720 

GTCCACATTG AATTGTTTTC CCTCAGAAGT TTTCAACCAC TCTATAAACT CATTTTTAGT 37 8 0 

TAATTTTGCT TGCATTGTCG CCACCTCCAT GATGATACTC ATTCACATCA AAGCCAACAT 384 0 

CGTTAGAGGC GTCTGTGAAA GGTTGTGATG TATCATATTC TTTTGGTGcT TTCGCGCTTA 3 900 

ATTC CGGCGT TAAACTACTG TCTTGTGATG ATTTCCACGT AACTTGTTGT TCTTCTTTTT 3 960 
TTGGGTCAGT AATAACGCCA ATACCTGTAA GTAACGTGAG GATAGCGCCT ATAATTGCGC 4080

TAGCTTGATT TAATTGAGTA GATAAATC T A ATCCGAATAA ATCCGTGACT TGCTTGATAA 414 0 

ATAGCAACAA T3CTCCAACT AAACCAGTTA GTACTGCTTT GTTTTTGAAT CTCAATTTCC 4 2 00 

AGTTAATATC CATTTGTTTG CTCCTTTTAT CCAAAATAAA AAAACGACTA AAAATT AG T C 4 260 

GTTTAAAATT ATTCAATGGT CAATGTCGGA GATCCTGAAT AAACATCACT TATAGTGACG 4 32 0 

TACAACATCC CTGAAGGATT ACTAAAGTTG ATATTTTTAC TTGCAACTCC GCTATTGACT 4 3 80 

CCTGATATTC CTAAATCACT TGACCCTAAA TTAGTTTGCG AAATCCTCAT TATACCGCTA 444 0 

CGTACATTTT CTATTGTCAC CTGATAACTT TTATTGGGTT CAACTCCATT TATTGTCCAT 4 500 

TTTGCTGTTG ATTCTTCTAT GCTATCCGGA TATTTATTTT TAGGTAAGGG TTTTATTACA 4 56 0 

AAAGATGAAG GCTTTTTCCA TACTTGGATA TTTCCAGCAT ATACTTTTGT ATATTCTTCA 4 62 0 

CCTTCGTAAA TAAACTTCTT TACATTTTTA AAATTACCTT CCATAAAAAT CACCCTTTAA 4 68 0 

TTAAATATAA CGTATTCGGG TCTTTTTGAT ATAT AT AG TT AT ATT CATTT TCTGTTCCTG 4 74 0 

TCCAAATTTT AACCGTCGGT TGAGATGCGC TTTTTAGTTG ATATAAATTA TCCGCTTGTT 4 800 

25 GTTTAGTAAA AGCTTGAGAT GACAAAACAT ACCGCTCGTC ATGATTATGA TTTTTTGGAG 4 86 0 

CATATAAATC ATTTAGTGTT TGTTTGAATT CCTCAAAATC TTCTGTATTA ACTTTTGAGC 4 92 0 

CAATCTGTTG CAATACACTT TCTGAAATAG AG TTG TTTTG TATTGCTTCT GCTAATTCTC 4 98 0 

30 TTAATGTGTT CATAGATTCA GGCGCGCTAT CAACTAGTTC AGCAATTTTT GTATCCGTAT 5 04 0 

ACGTTTTAGA GTCGTTGAGA GTTGTATCTT TGATTTTTTC AACTTCTTGC AATTTATTTT 5100 

CTAACCCTTC AACATTTGCG ATATTGATTT TGTCCAATAA CTCAGGTTCT GCTTTGATAT 5160 

CTGTATCTTT ACCATCAATT TGCCACATTT TAGTGTCAGG ATTGATTGAT ACTACAGTAC 5220 

CGTTTTTACC GGGTGCGCCT TGTTCTCCTT TTTTACCTGC TTCACCTTTT GCTCCAGGTT 52 80 

GTCCCGGTTC ACCTTTATCA CCTTTCGCAC CTTTAAATCT ACTTTCATTC TTTTCGATGT 534 0 

AAGAAATGAC ATCTTTATCT ATTTTCTCTT TAAAGTCTTT GCTCAATAAA TCTGTCGCGT 54 0 0 

TAT CTTTTAA AATTCTCGTA ATAGCATCAT CTACCAATTT AACATCGATT TCTTTTGCTA 54 6 0 

CAGCAGATTC AATACCACTA TCAACGATAT TGAAAGAAAA GTTTGCGACA TGTATTTTTT 552 0 

CTTCTTCTTT CTCTAAAAAC AGCTTACAGC GAACATAACC AGCGTGTTTG ATAACCTTTT 55 8 0 

TAGGTATCTT GTAGGTAAGG AAACCTTTTA CAACATCGTC GATAATAAGG GGCTCATTTT 564 0 

TGAATATAGA GCCATCTTCC AT AAA C AAA T GTAATCTAGG TGTTAAGCCA TGTGCTTTTA 57 00 

GATCGATACG ACCTTGTTTG TCATTGATAC CT ATT CTT AT AG ATGCTG T A TTTTCATCTT 57 6 0 
CAACATCTTT TATTTTGTAC ATTTACACAC CTCTTTATTT ATATTTATCC CTTGTGAAGT 5880

AGATACCTTT TAAGCCGATT TGTTTATATA ACTTAGCGAT TGTACTTGCT TGATGTTGGC 5 94 0 

ACCACTCTAT AGCAGTAGCG TATTGGTGGG TAGCTGGATT CTTAGGATTC CATCTAATTC 6 COO 

GGTACAATGT GTTTTGACCT TTATTGATGT AATCCTTTCT TACGAAGCTA GCACCGCCCA 6 0 60 

TGATTGCTTT TGCTGGAGAT GTCCAACCTT TATTC CTTG C AAACGTCATT GCGTAGTTAG 6120 

GATTGTTGTC GTAAGCGCCA ATGCCGAAGT AGTTGTATAC TCCATCTTTT CCGTTAGCGA 6180 

AG TTACTTGT TCCATATCCA CTTTCTAAGA AAGCATGCGC GATTAAATAA ATTTCATTAA 6 24 0 

15 TGTTGTGCTT TTTACAAGCT TCTGCGAACG CTTTACCTTG ATTATTCAAT GTTCCCTTAC 630 0 

CTTTAAGTAT CTTATTAAGT GCGCTAACTG AAACACCTTG ATACTTGCCT AAATTAAGCA 6 3 60 

TTTGGTAGCA TTGTGTGTTA CTTTCCCATA TACGCTTTAC ATTCATTGCT GAACTCGTTT 64 20 

20 GTGCTCGTGT AGCGTTASCC AACCCCAAGC ATTAGATTTT TTCGGGTTAC CTCTTGCCAT 6480 

TTGTTTATCC AGTGCTTGTT TGAATGTATA AGGACTCGTT TCTGTTATGA TCTGCGGTTG 654 0 

TTTAGATGCC GAACCATTGT TGGCTGTTGG TGACGAGTCT CTTACATTAG CTATATCAGC 6600 

2S GTTTTTATTA TCTACCATAA CTTTTATTCT AGATTTTGTT ACTGTTGGCT TAG TT AT AG A 66 6 0 

ATTTAATAAT TTTTCTCTGT TTTTAAATAT ATTAAGTAAT GCCTTTTCTA ATGCTTCGTA 672 0 

TTTATCTTTA GGAGGAACAC CGTTGTCAAT CATATTCCAA TTAACATGTT CCAACATTGA 67 8 0 

ACGCCAAATG CTGTCGTCTA CTTTTAAATT TTCAATACTT AGAGGTATCT CATATTTGGC 684 0 

CATCATATCT ACAGCTACAA CCATTGCGTG AATCT CATTA AAAATAAATT CATTTTTACT 6 900 

CGCACTATAA TCTTCACATA CGTCTATAAC TATATAATCA GGTTCATTAG GAACTTCAAA 6960 

TACAGCTCTT CTAGGTG CCC AAATATTATG TCTATCAACA TAAAAGTGGG GAT ATT CT AC 7020 

ATCCTGTTTG TATTTCTT CC TACTGTTATA TAAACTTTCT ACCGAGCTCA TCGTTTGTGC 7080 

GTTTCTAATC ATTATTC CTT TAGGTTTTTC GAGTCGTCGA TTACCTTCTA CTATAAAGTG 714 0 

ATAAATATAT TCTGGATAAT TAACCTCTTG GCTAGAAATA GTGTACTTTA TAGTTGTTAC 72 00 

ATCTTTCCAA ATTGGAACTT TTTTATTATT TTTTTCGTTA TCATCACTAT CATCTTCTGG 72 6 0 

TTTAGGTGCC GGTGT AG TTT TGTCTGGATG AT ATGGTGG T C7AACAAAAT ATTTAACCCC 732 0 

TCCACCTGGT CCATCATGAT AAGAGTGTTT AATTTTATAA GGTGGACTTC CTGTTGCGTT 73 8 0 

ATTTGTATAC CAGTTTTGAT CTACGCCATA CCAATAGTCT TTTGTGCATG GTCCCACTAC 74 4 0 

50 AATGTTTACA TGTCCTGCCC AACCACCAGT CCAAACACCC CAGTCGCCTG GTTGTGGTAC 750 0 

AAAATCTTTT GTATTTCTAA TTATCTTGAA ATCTCTACCT CTATAATTGG ATTTTTGAGC 756 0 
TAAATCCCAG CATTGTGCTC CCATTCCAGA ACCAGGTACA TCAATAGCTA TTTTGTTTTT 7680

AGCGATATAT AACGCCCATT CAACCACTTC ACTAGCTGTG GGCTTTCTAT TTTTCGGATT 7 74 0 

5 AGGTAATCCC ATGTATGCAC CTCATTTCAA TCAAAATAAA AAGCCAGTGC CGAAGCACTG 7800 

ACTCTTAACT GTTATTTACA TTTACCAAAC CAGAAGCACG CCCAGAAGCT ATATC CTAAA 7 860 

ATCCCTTTAA GCATGGTAAT CACCTCCTTT AAATACCAAA AACAGTTCTT AGTAAAGCTA 7 920 

W TGACAATCGT ACTGAAGATA GTCCCTATCA AACCTAGAAT CCACATTTTT ATGTCTCTAA 7980 

TATTCTTGGC ATTCTTTTCT TTATTCTTTT CATCTTCTAC CTTGTCGCGC TTTAATTCTT 804 0 

CAAAATTTCT ATCTAATTTG TCATAAATCT TTTCT7GCGC TCTAAGACTA TCTTCTATTC 8100 

TGTCGAATTT TTCAAACATA GTCTTATCAT TTTCTTCTAA TCGCGTTAAA CGCCAATCTT 816 0 

GTTCATGTCG TTTGGTAAAT CCAAACATTA TGCCACCCAC TTTATTCAAA TTAAAAAGCC 8 220 

ACAAG CATTA CACCTGTGAC TTTTCATCTT TTGTTTCTGG ATATTTTTCT CCAGTGATTA 828 0 

AAGCGTATTC TTCTTTATCG ATTAAACCCT TGTCTACGTA CCACTTAATT TGCTCGTTTT 834 0 

TATAGTAACC CCAAACATAA AAAGTTTTAA TGTCTTTAAA AGTTGGATAA ATCATCTTCA 84 0 0 

25 TTATTTAAAC GTCCCCCTCA GTACTTGTTT TGTTAGTTTT CAGTTCAGTC AACTGTTGTG 84 6 0 

TTAACATAGC GTTTTGTTGA GCTAATTCCA TTGTTAATAC GTTTACTTGT GCCACCTGCA 852 0 

TTTGCATACT CGCAACCATT CCGCGAAGTT CCT CATC ACT TAAATCTGAC GCACTTTGTT 858 0 

30 GGTTTGATGC ATTCGGTACG TCTTCTTTTT CGAAATTGCT ATTGTATTTA ATTTCGCCGT 8 64 0 

TAGTGAAAAC AAACTTTCTA GGTTCGAACT CTTCTTTAAA TTTAATAGGC ACATTGTTAT 8 700 

CATCTACATC TAAACTATTG CGTAAACCGC CAGTATTAAC GAATCCGATA ACTTCGTTTT 876 0 

35 TATCGTTTAC TGTGATTTTC ATTATTTCCA C C C CAT AATT TTAGTTATAG TAACTTTGTT 8 82 0 

GGCAJTCGCT CCAGAACCTG ATGTTTTACC TAAATCAAAG TACACATCGT TATCTATTCT 8880 

TAAAGTAGTG CTACTTGTTT TGGATAGTAA GCACTCATAA ATACCGCCAC CGTTGCCGTC 8 94 0 

TGAGTCAACT ACATTCGCTT TACTCAATTG AATCGCGTTA GGTAATGCGG TTAGTCCGAA 9000 

TCCCTCAATA ACGCCACCTG GATAAGTTCC ACTTACCAAC AAAATAGAAT AGTTTGTGTA 9060 

CGGTTCAGTT AGATTGATTG TTGTACCTAC ACCATTTGCG CCACCGTCGA ACAATACCGT 9120 

TGATTTATGT TCATTAGGAA CTGTCCACTG TTGCTCAAGT CTGCCGTTTG TGATTGATCG 9180 

TGTGTAAATC TTTTTAGAGT TATAAGGTGT GAAGTTAAAT AGCTTGTTTG TATCATCTTT 924 0 

AACGAATACC GATAAATAAC CCTCATAACT TTCAACGCTA CCTGGTAAAT CCGGCACTCT 93 00 

TGTTGCATAG TAATTACCAG CAGTTAAATA TCCCAAATCG CCTTGCGCAT TATTTAAGTT 93 60 
GAA7TTATCA TCTACATACT GCTTAGCTTG ATTTAAAGCG TTGTTAGACG TTTCTTCAAC 9480

AAATTGCTTA GTTAAGTTTC CATC ATT CTT TTTATAAAAC GGGTACCATG TGCCGTAGAT 954 0 

TTTGTATTTT GTG TACT CAT CGTTTGAATC GTCTGGGTAC CATGTTGCAC GAG CAG T ATT 9600 

ATT AT CAACA ACATAAACAA CTAACACACC AGATTTGCTT GATGTATAAG TTGATTCATC 966 0 

GAACGAAGAA CCGTCATCAA CACCATCTTG TCCAGGCTTC TCTAACGTGC CTATATCCGT 972 0 

CTTTTCTGGC GCATCTGTTG CATTAGTAAT ATGAATAATC CTAGATGTGT TAACTGCGCT 97 8 0 

TAAAACGCTA TCTATGGACT GCTCATACGA TTCAATTGCT TTACCGTAAT CATCTGTAAG 984 0 

TTTAGACTTT TGCCAATTCG TTGTTGAATT ACCTTTAACA AGGTCAGCGC CATTGATTTG 9900 

TTGTTCAACT TCGTTAACAC GTTCAAAAAT CGCTTGCTCT TTTTCAACTA TTTTATCGAA 9960 

TTCAGCTGTA ACAGCTTGTG TTGCACTAGT TTGCGTCGCA GTAATAGCTT GTATAGCTTC 10020 

GTTTTGCTTG ATTTCGATTT GTTGAATGCC TTTTGTCGCA CTATCATTCA CTTTTG C TAT 10080 

TAACGTTTGT G TAT CAG CCA TATTTTGCTT TAATTGGTTA AAATCTTTAC CGACAGCTTC 1014 0 

GATAGTATCT TGAATAGATT TGATATAAAC AAGCTTTGTT ATACCATCAA ACCCACTAAC 10200 

25 TAAATCATTT TCAATATTGA AGCTAAATTG ACGTTCAACA ACAACATTAT TACTCCCGTT 10260 

TTGTGTAAAG AATGCCTGAG CATGCACCTT GCCTGAATGT TTTAAAAATT CATTCGGTAT 1032 0 

CACATACTGC AAACGCCCAT TAATTGCGTC TACTATCGTT AATTCGTCTG AAATATAAGC 103 80 

30 GCCTCTATCT ACGTTATAAT CATCGGTTTT TAAnACGATA GATGTTTTAA CATGTTCAGA 10440 

ACTTATAGAT AAGGGTCTGT TATnCTTAGT 10470 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3647 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

ATCAGATCTT GAGAATCGAG TTATTAAGTC TATCGAAGAC TTAACTAAAA TCCAACCATT 6 0 

CATGCCTACA CAAGATTTTG ATTTTAAAAC TAAAGAAATT CAATCAAACA TTTCTGAAGA 120 

AAGATTTATC GAAATGATTC AGTATTTCAA AGAGAAAATA ACAGAAGGGG ATATGTTCCA 180 

so AGTTGTGCCA TCAAGAATTT ACAAATATGC GcATCATGCT AGTCAGCATT TAAATCAACT 24 0 

TTCGTTTCAA CTGTATCAAA ATTTAAAACG ACAAAACCCA AGTCCATATA TGTATTATCT 3 00 
TCAAATTGTA ACAACTAATC CTATTGCAGG TACGATTCAA CGTGGTGAGA CGACACAAAT 420

AGATAATGAG AATATGAAAC AACTACTTAA TGATCCAAAA GAATGCAGCG AACATCGTAT 4 BO 

5 GCTAGTTGAT TTAGGACGTA ATGATATTCA TAGAGTAAGT AAAAT CGGT A CCTCAAAAAT 54 0 

T A CT AAA TT A ATGGTTATTG AAAAATATGA ACATGTTATG CATATCGTAA GTGAAGTCAC 6^0 

AGGTAAAATA AATCAAAATT 7ATCGCCAAT GACAGTTATT GCGAATTTAT TACCAACAGG 6 60 

TACCGTTTCA GGTGCACCAA AATTACGTGC AATTGAAAGA A7ATATGAAC AATATCCACA 72 0 

TAAACGGGGC G7TTATAGTG GTGGTGTTGG ATACATAAAT TG T AAT CAT A ACTTAGATTT 7 30 

TGCATTAGCA ATTCGAACGA TGATGATAGA TGAGCAGTAT ATCAACGTAG AAGCTGGTTG 94 0 

TGGCGTTGTA TATGATTCTA TTCCTGAAAA AGAACTGAAT GAAACGAAAT TGAAAGCTAA 900 

AAGCTTATTG GAGGTGAGCC CATGATCTTA GTTGTAGATA ATTATGATTC CTTTACATAT 960 

AACCTAGTGG ATATTGTTGC TCAACATACT GACGTCATTG TTCAATACCC TGATGATGAT 1020 

AATGTG CTG A ATCAATCGGT GGACGCTGTT ATTATATCTC CTGGTCCAGG GCATCCATTA 10 80 

GACGATCAAC AGTTAATGAA AATCATATCA ACCTATCAAC ACAAACCCAT TTTAGGTATT 114 0 

25 TGTTTAGGGG CTCAGGCACT GACTTGTTAC TACGGTGGAG AAGTCATT AA AGGCGACAAG 12 0 0 

GTTATG CACG G CAAAGTTG A TACACTAAAG GTTATATCGC ATCATCAACA TCTGTTATAT 12 6 0 

CAAGATATAC CAGAACAGTT TTCAATTATG AGATATCATT CATTAATAAG TAACCCTGAC 13 2 0 

30 AATTTTCCAG AAGAATTGAA AATTACTGGA CGTACCAAAG ATTGTATACA GTCATTCGAG 13 80 

CATAAAGAAA GACCGCATTA TGGTATTCAG TACCATCCTG AATCATTTGC TACAGACTAT 144 0 

GGTGTCAAAA TAATTACAAA TTTCATTAAT CTAGTGAAGG AAGGATGAAA ACCATGACAT 150 0 

35 TACTAACAAG AATAAAAACT GAAACTATAT TACTTGAAAG CGACATTAAA GAG CT AAT C G 156 0 

ATATACTTAT TTCTCCTAGT ATTGGAACTG ATATTAAATA TGAATTACTT AGTTCCTATT 162 0 

CGGAGCGAGA AATCCAACAA CAAGAATTAA CATATATTGT ACGTAGCTTA ATTAATACAA 16 8 0 

TGTATCCACA TCAACCATGT TATGAAGGGG CTATGTGTGT GTGCGGCACA GGTGGTGACA 174 0 

AGTCAAATAG TTTCAACATT TCAACGACTG TTGCTTTTGT TGTAGCAAGT GCTGGcGTAA 1800 

AAGTTATAAA ACATGGtAAT AAAAGTATTA CCTCaAATTC aGGTAGTACG GATTTGtTAA 186 0 

ATCAAATGAA CATACAAaCA ACAACTGTTG ATGATACACC TAACCAATTA AATGAnAAAG 192 0 

ACCTTGTATT CATTGGTGCA a CTG AAT CAT AT C CAA T CAT GAAGTATATG CAAC CAGTTA 1930 

50 GAAAAATGAT TGGAAAGCCT ACAATATTAA ACCTTGTGGG TCCATTAATT AAT C CAT AT C 2 04 0 

ACTTAACGTA TCAAATGGTA GGCGTCTTTG ATCCTACAAA GTTAAAGTTA GTTGCTAAAA 2100 
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AAGCAACACT ATCTGGTGAT AATTTGATAT ATGAATTGAC TGAAGATGGA GAAATCAAAA 222 0 

ATTACACATT AAATGCGACT GATTATGGTT TGAAACATGC GCCGAATAGT GATTTTAAAG 22 6 0 

5 GCGGTTCACC TGAAGAAAAT TTAGCAATCT CCCTTAATAT CTTGAATGGT AAAGATCAGT 23 4 0 

CAAGTCGACG TGATGTTGTC TTACTAAATG CGGGT7TAAG CCTTTATGTT GCAGAGAAAr 24 00 

TGGATACCAT CGCAGAAGGC ATAGAACTTG CAACTACATT GATTGATAAT GGTGAAGCAT 24 6 C 

w 

TGGAAAAATA CCATCAAATG AGAGGTGAAT AATATGACGA TTTTATCAGA AATTGTTAAA 2520 

TATAAACAGT CACTTTTACA AAATGGCTAT TATCAAGACA AACTTAATAC CTTGAAAAGT 258 0 

GTGAAGATTC AGAATAAAAA A7CTTTTATA AACGCAATTG AGAAAGAACC AAAGCTAGCA 264 0 

15 

ATTATTGCAG AAATTAAATC GAAGAGTCCT ACAGTTAATG ACTTACCTGA ACGAGATTTA 2 70 0 

TCGCAACAAA TCTCAGATTA TGACCAATAT GGTGCAAATG CCGTGTCCAT TTTAACTGAT 2 760 

GAAAAGTACT TTGGTGGTAG TTTTGAAAGA TTACAAGCAT TGACGACAAA AACAACATTA 2 820 

20 

CCCGTATTAT GCAAAGACTT TATTATAGAC CCGCTTCAAA TTGATGTTGC TAAACAAGCT 2 6 80 

GGTGCATCTA TGATTTTATT GATCGTTAAC ATCTTATCTG ATAAACAATT GAAAGATTTA 2 94 0 

25 TATAACTACG CTATATCGCA AAATCTAGAA GTGTTAGTTG AAGTACATGA TCGCCATGAA 3 0 00 

TTAGAACGTG CCTATAAGGT TAATGCTAAA TTGATTGGTG TAAATAACAG GGACTTAAAA 3 060 

CGATTTGTTA CAAATGTGGA ACATACAAAT ACTATTTTAG AAAATAAAAA AACAAATCAT 3120 

30 TATTATATTT CTGAAAGTGG TATTCACGAT GCATCTGATG TAAGAAAAAT CTTGCATAGT 3180 

GGTATCGATG GOTTA CTAAT AGGTGAGGCG CTTATGCGTT GTGACAATCT ATCTGAATTT 3 24 0 

TTACCACAAC TGAAAATGCA AAAGGTGAAG TCATGATGAA ATTGAAATTT TGTGGCTTTA 3 3 00 

35 CATCAATAAA GGATGTTACA GCGGCCAGTC AATTACCTAT TGATGCGATA GGTTTCATCC 3 3 60 

ATTATGAAAA AAGTAAAAGG CATCAAACAA TTACCCAAAT AAAAAAGTTA GCGTCTGCTG 34 2 0 

TTCCAAATCA TATCGATAAA GTATGTGTCA TGGTAAATCC TGATTTAACA ACAATTGAAC 34 8 0 

40 

ACGTATTAAG CAATACGTCA ATTAACACAA TACAGTTACA CgGCACAGAA TCTATTGATT 3 54 0 

TTATACAGGA AATTAAAAAG AAATATTCAA GCATTAAAAT C ACTAAAG CT TTAGCTGCaG 3 600 

ATGgAAAACm TwATCCCAAA caTtAAtnAA tnTTAgGGGG TCCGTGG 3 647 

45 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5966 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

CcAcCTTGAC CACCTTTACG TGGAATCTTT TCmCCTkGAG CAACaTCGaT AATaTATATT 60

GAAAgTCAAC AAGTTCTGGA CTAAATGTTG CTGCTAAGTT ATCGCCACCA GATTCTATGA 120

AAATTAGTTC TATATCGTCA TGACGTTCTA ATAATTCGTC TATTGCTGCA AAGTTCATAG 180

ATGCATCTTC ACGAATCGCA GTATGAGGAC ATCCACCAGT TTCAACACCA ATGATACGAC 240

TTTCAGGTAG AACTCCTGAA TTTACTAATA TCTTTTCGTC TTCTTTTGTA TATATATCAT 300

TTGTAATAAC GCCGATACTC ATTTCTTTTG AAAGACGTTT TACAACTTTT TCAATTAATT 360

GTGTTTTACC TGCACCTACA GGACCACCAA TACCAATTTT AATCGGATTT GCCACAATTA 420

TAACCTCCTA TGATATGAAA tTCTAACATT GaCGTTCTCA TGCGCCATTT GATTTAGTTC 480

TAAACCAGGC GCTGTCATGC CAAAATCTGC TTCTTTTAAT TCGAAAATCT GCTTTCTTGT 540

TCCTTCTATA TAAGGAATCA TGTGAGTAAC TATCTTTTGA CCAGCAGTTT GTCCAAGTGG 600

AATAGCACGA ACAGCATTTT GAGTTAAACT TGAAACATTT TGATATAAAT AGTAATCAAT 660

AATCGTTTCA ATATCTACAC CTAAATGATG GCCTAGCATA GTAAAACAAA TAGCTGGATT 720

TnACTTTGCT TTCTTATCTT GCATTTGTTG ATGATACCAA GCAATCCATG GGCTATtATA 780

AAGTTCTAAA GCCAATTTAA CCATGCGAGT CCCCATTTGT kTTGCACCAA CACGTGTTTC 840

TTTAGGTAAG TTTTGrACAr ACATCAGTTT ATCTATGTGT AATACTTTTT GTGTATCATC 900

ATTTTCCAAT GCATCATAAA CTAaACGCAT GGCTAAACCA TCAGAATAGG TAAGTTGCTC 960

TTGTAAAAAC ATTTTTAACC AAGCAATAAA AGTATGATCG TCATGAATTA TATTTCGTTG 1020

AATATATGTT TCAAGACCAA ATGAATGACT GAAAGCACCT GTTGGAAACT GTGAATCACA 1080

GAACTGAAAT AATCTTAAGT GTGTATGATC AATCATGAGA ATGCCCTATA TGTCTGAAAG 1140

CCTTATTAAC TTTACGGTCT TCTCGAACAT ATGGGATGCC TAAACTTTTT AATAAATCTT 1200

CAACTAAATA ATCATATTGT ACTAGCATTT CAGTCTCTGT AAATTGTGCT GGCAAATGAC 1260

GATTTCCTAA TTGATGGGCT ATATCTCCCA TTTCTTGCAA TGTTCTTGGT TGAATCACTA 1320

AAAGATCTTC TGAATTAACA TCCACAATAA TCATATTATG GTCATCTGCG TATAAAATAT 1380

CTCCATATTG TAAGTCAATA GGTTGTTTTA AACGAATGCC TATTTCAGTG CCATGGTCTG 1440

TAACGACTCT TTGAATACGT TTAACAAGAT CTGAATTTTC AAGGTATACT TTTTCGACGT 1500

GCTTTTGTTT TTCTGAATTT GACAAATTGG CAATATTGCC TTGGATTTCT TCAACAATCA 1560

TTCTATGTTC CTCCTAGAAT AAGAAGTATC TTTGAGTTAA TGGTAACTCA GTTGCTGCAT 1620

TACTTGTAAT TTTTTCTCCA TCTACATATA CTTCATATGT TTGTGGATCA ACGTCTAATT 1680

GACGCACCAT GCGTTTTAAA TTTAATGCAC GATTGATACC ATTTTCATAA GCAGTTTTAG 1800

ACACGAATGT CATTGACGTA CTTGTAAGGT TTCCGCCGTA TTGACCATAC ATTTTACGGT 1860

ACTTCATCGG TTCAGATGTA GGTATAGAAC CATTTGCATC GCCATTTACG GCAGAGTTAA 1920

TTAATCCGCC CTTTACAACT AATTCAGGTT TAACCCCAAA GAAAATTGGG TCCCATAAGA 1980

CAATGTCAGC TAGTTTGCCC GGCTCGATAG ATCCTACATA TTCAGAAATA CCATGTGTAA 2040

TTGCTGGGTT AATTGTATAT TTAGCGATAT AACGTTTGAT GCGATTATTA TCATTATGTT 2100

CAAAATCACC ATCTAAAGGA CCACGTTGTT CTTTCATGCG ATGTGCTACT TGCCATGTTC 2160

GTGTAATTAC TTCACCTACA CGGCCCATTG CTTGTGAATC GGAACTAATC ATACTGAATA 2220

CACCCATATC TTGCAGAACA TCTTCTGCTG CAATCGTTTC TTTACGAATA CGTGAATCTG 2280

CGAATGCGAT ATCTTCAGGA ATAGCCGCAT TTAAATGGTG AGTAATCATT ACCATATCTA 2340

AATGTTCATC TACAGTATTA TGTGTATAAG GCAAAGTTGG ATTTGTAGAT GAAGGTAAAA 2400

TATTTGAAAA TGCAGCGGAT TTAATTAAAT CAGGCGCATG ACCGCCACCA GCACCTTCAG 2460

TATGGTACAT ATGAAGTACA CGGTCTTTAA CAGCAGCCAT TGTGTCTTCC ATAAATCCTG 2520

CTTCATTTAA AGTATCTGCA TGTAATGCAA TTTGAACATC AAATTCATCA GCAACATCTA 2580

ATGCATGACT CAAAGCAGAT GGTGTTGCAC CCCAGTCTTC ATGTACTTTT AATCCAATTG 2640

CTCCGGCATT GATTTGTTCA ATGAGTGCAG TTGGATTTGT TGCTTGTCCT TTACCTGTAA 2700

AACCGACATT AATCGGTAAA CCTTCGGCAG CTTCTAACAT TCTATGAATA TGCCATGGAC 2760

CTGGAGTTAC AGTTGTTGCT TTAGAACCTT CTGAAGCACC AGTACCACCA CCAATATGAG 2820

TCGTAATACC ACTTTCTAAT GCGACCTCTG CTTGTTCAGG ATTAATAAAA TGAACATGAG 2880

TATCAATACC ACCAGCAGTG ACGATTTTAC CTTCAGCGGC AATGATATCT GTTGTTGAAC 2940

CTATAATAAT GTCGACATTA TCCATTATAT CTGGGTTGCC GGCATTACCT ATGGCGAAAA 3000

TATAACCATT TTTAATGCCT ATATCAGCTT TAACCACTTT ATCGTAATCG ATAATAACGG 3060

CATTAGAAAT GACAAGGTCT GCAACGTTCA CGTCATCACG TGTTACACGA GGATTTTGCG 3120

CCATACCGTC TCTAATAGAT TTACCACCAC CAAAAGTAGC TTCTTCACCA TAAACCGCAT 3180

AGTCTTTTTC TATTTGAGCA AATAGATTCG TATCACCTAA ACGAATGGAA TCTCCAACAG 3240

TTGGACCGTA TAAGCTCGTA TATTGATTTT GCGTCATTTT AAAGCTCATG ATCTTTTTCC 3300

TCCTTTTTTA TTCACGTTTT CAGCACCGTT ATCTCCGAAT ACACCTGCAT ATTCATCATT 3360

TTCATCAGTT GGGCGATAGA CACGTGACTC ATCGATAGGA CCATTGACCA TACCACGAAA 3420

ACCAAAAATT TTACGTTTGC CAGCATATTC AACTAATTGA ACTTCTTTTT TATCCCCAGG 3480


EP0 786 519 A2 



TTCGAAATCT AATGCTGCAT TTGCTTCATA AAAATGAAAA TGTGAGCCCA CTTGAATTGG 3600

TCGATCTCCT GTATTTTCAA CTTCGATAAC TGTTTCAGGA TGATGGTTAT TAATTTCAAC 3660

CTCTGTACTT TTTGTAATAA TTTCTCCTGG TATCATTTGA CTGCCTCCTT TAAACAATAG 3720

GGTGATGTAC TGTGATTAAC TTAGTACCAT CGGGGAACGT AGCCTCGATT TCGATATCTG 3780

TAATCATGTG TTCGACACCA TCCATGACAT CTTCTTTGTT TAGAATTTGT CTACCATAAC 3840

TCATTAACTC TGCAACGGTC TTACCATCGC GTGCACCTTC TAATAATTCA TCGCTGATTA 3900

AAGCTAATGC CTCAGGATGA TTTAGTTTCA AACCACGTGC TTTACGACGA CGTGCAACTT 3960

CCGCCGCCAC TACAATCATT AATTTGTCTT GCTCTCGTTG TGTAAAATGC AAATTAAAAC 4020

CCCCAATTTC ATATTAGATA CaATTTACAA AATTTATATT AATCCTAATT GTTGTGATAA 4080

ACAAGTAATA TACAAAGTTC AATGTGTAAT TAGAAAATTA TATTTTTAGC ATATCCGATA 4140

TTGAAGCAAA CAATCTAATC GAAAACAAAT AGTGGAATAT ATTTATGTAA AAACCAAAAT 4200

AGTTTTTAAT ATAACTTTTC ATAGAATAGT AGTATATTAA TGAGTAATGA TTCAAAGGAA 4260

AGGTGAAAGA TTTGAAGATA ATAGATGTGC TTTTGAAAAA TATATCTCAG GTTGTGTTAA 4320

TTAGTAATAA ATGGACAGGA TTATTTATCT TAATAGGATT ATTTGTAGCC GATTGGACAA 4380

TTGGATTAGC GGCTATTGTA GGTAGCATCA TCGCCTATAC TTTTGCGCGT TTTATAAATT 4440

ATAGTGAGGC AGAGATTAAT GATGGGTTAG CTGGATTTAA TCCAGTGCTA ACTGCCATTG 4500

CGTTAACAAT CTTTTTAGAT AAGTCAGGAT TAGATATTGT TATAACAATG ATAGCAACTT 4560

TATTAACGTT ACCAGTTGCT GCTGCAGTGA GAGAAGTTTT AAGACCATAT AAAGTTCCGA 4620

TGCTGACGAT GCCTTTTGTC ATTGTGACTT GGTTTACAAT TTTACTTTCA GGACAGGTTA 4680

AATTTGTAGA TACATCGTTA AAGTTAATGC CTCAAAACAT TGAAACGGTT AATTTTAGCA 4740

ACAATGATAG AATaCATTTC ATTCAGTCAT TATTTGAAGG ATTCAGTCAA GTATTTATCG 4800

AAGCGAGTGT AATTGGTGGC GTATGTATTT TAATCGGCAT ATTGATAGCA TCAAGAAAAG 4860

CAACACTCTT AGCTGTTATA GCTAGTTTGT TAAGCTTTAT CATTGTAGCT CTATTAGGTG 4920

GTAATTATGA TGATATTAAT CAGGGATTAT TCGGTTATAA CTTTGTATTA ATGGCAATCG 4980

CACTAGGATA TACATTTAAA ACAGCGATTA ACCCTTATAT TTCGACTTTT TTAGGTGTGT 5040

TATTAACAGT AGTGGTGCAA CTAGGTACAA CAACATTGCT TGAACCGTTT GGCTTACCTG 5100

CATTAACATT GCCATTTATT ATCGTGACAT GGATTTTATT ATTTGCTGGT ATTAAACATG 5160

ACAAAGTAGA TGCTTGATAG TTAAATCAAA CCTAATATTG TTTGAATATC ACCTTAAACT 5220

ATACAGCGAA TTGTATAGTT TAAGGTGTAT TTTTATGGAT AAAATTAAGT GCATACTTAA 5280

GTGTTAAACT AGGAATAAAT AATTTATATT GTGTGTTGTG TGGGGTGACT AATATGAATG 5400

ATATGGATAA TTCCTTTTTA ATAACAACGG AAATTCAAAG AAAATGGATT GAAAAATTCA 5460

AAGTAATTAG AGATACATTT AAGGCTAAAG CTGAATATAA TGATCAACAT AGCCAATTTC 5520

CATATAAAAA TATTGAATGG TTAATTAAAG AAGGTTATGG AAAATTAACG TTACCAAAAG 5580

CATATGGTGG TGAAGGTGCG ACCATAGAAG ACATGGTTAT TTTGCAATCA TTTTTAGGCG 5640

AACTTGATGG TGCCACAGCA TTATCTATTG GTTGGCATGT GAGTGTCGTA GGACAAATTT 5700

ATGAACAGAA ATTATGGTCT CAAGATATGT TGGAGCAATT TGCTGTTGAA ATTAATAATG 5760

GTGCATTAGT TAATAGAGCA GTTAGTGAAG CTGAAATGGG TAGTCCAACA AGAGGGGGAA 5820

GACCAAGTAC ACATGCTGTT AAAGCTGATG ATGGGTATAT TTTAAATGGT GTGAAGACAT 5880

ATACATCAAT GAGTAAAGCA CTAACACATA TTATTGTTGC TGCTTATATA GAAGAATTAG 5940

AAAGTGTTGG TTTTTTCTTA GTAGAC 5966

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17310 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CTGTGTCATC GCGAAATAGT TAGGGTCATT CATTAATCCT TTTGAACGTA TTTCATCAAA 60

ATATAACAAT TTCATTAGTA AAGGGGACTT GTTCAAACCA GCTATAATAC AAAATAGACC 120

TATAGTCACA CTGCTTATAA TATAAGAGGT AACGATCACT TTTTTGCTAT TACCTAACTT 180

AAAG5TGATC ATCCCTAAAT AGAAATAAAT GACTACAAAT GCATATTTAA CTGTAGATGC 240

AAGAACTTCC TTAACCGTAA TAAATATCAA ATCATCAAAA AATaGCaAAC AArGCGTAAT 300

AATCATACGA TATGTATACA AAATAATGAm AAACTGTmAA AAATGATTTG CCTTTAATAA 360

ATGGTTAGCG AAAAACAGTA AATAAACTAA TATTAGTAAT GTGATAAAGT CAGCTATAGA 420

AACATTCACA CCGGCAATAA CCGAAGATTG CTGAATAAAA ACCGCTAAAC CGATAAGTAA 480

CAATGTTAGT AATTTACTAT TGTGTTGATT TTCCATTATA AACGTCTTCC ACTTCTTTAA 540

TCATTTTCTC CTCAGTAAAA CATTCTAAAT AACGTTTTCT AGATTGATTA CTCATTTTGA 600

TGTAATCACT GTCTATTAAA TATTTTTCCA GGACTTTAGC AATAGTTTCG GGTTGGTTGT 660

TCATCATACA TATACCATTA TCAGCTACTA ATTCTGAAAT ACCGCCAACA TGACTGGCTA 720

TTATTAAAAT AAACGTATCG TATTGTGATA ATAAATGACT CGCATTAATG ACATTGCCCA 840

AAAATGTGAC ATCATTTTCT AACCCAGCTT GTACAACTTG TTGCTGACAA TCATTTAATG 900

TAGGTCCATC GCCTATAAAT GTAAAATGCG CATGATTACT GTTATGTAAT TTCAATATCT 960

CTATTGCCGC GATTAGATTT TGTGGCAATT TTGGATAAGC AAATCTTGCA ATCATAACAA 1020

ATTGATGCTT TGTCGGGGCA TTAATCTGTA AATCTTGTTT ATTAGGCAAC ATTCCAACTA 1080

CTTCGCCAAT ATTGTTATGT GATTGGCTTT TTAGCGTTTG CTTAACAGCG GGAACATCTG 1140

CAATACCATT ATGTATTGTG GTTAATTTCA ATCGATTAAA TCGATATTTT AACGCTAACT 1200

GTTTATCGAA ATCTGAAACA CAAATAATGC TATCTGTAAT AAGTGACATT AATTTTTCGA 1260

TAACTAAATA TAGAAATTTT TTAGCTGGTT TAACACCCTC TGTAAAAGCC CATCCATGTG 1320

CAGTAAAAAC TATACGTGTG TCTTTCGATT TCGAAATGAa CTtCGCAATT CGTCcGACCG 1380

TtCCAGCTTT GGAAGAATGT AAATGGATAA CATCAGGTTT AATTTTCGAG AATAACTGTG 1440

CTAACACTTT GACAGCTAAA ATATCTTGTT TAAAGTCAAT TGGACCTACT AAATGTTCGA 1500

TAATAATTAC ATTAACTCTT GCATCTAGTT GTTCAATCAT TGGTCCATGA TTGCCTACAA 1560

TGACATAAAC ATCATTGTGT ACGCAAAAAT GG1TGGCGAG TTGAATGAGA TGTGTTTGTG 1620

CACCACCATT GTCTGCTTTA GTAATACAAT ATATAATTTT CAACTGTTAC AAACCCCTTT 1680

AATGCTATAC TTTCAATTTC TTAACATGGC TATCTCATCA GATGAATAGT ATTTATAGCC 1740

ATGCAAATCA ATGATGGCAC ATATTTCTTA ATGCCATTTG ATACTGTCTC AAGGGATTCC 1800

TCGTTATACT GTAACAATTG GTCACAATCT TTAAAATATA ACTTTTATTT GAACTTATTA 1860

AGTAAATTAA GACTACCTTG AGCCTTCCCC TGTAATAACA ACCATCAATG TTCTAATTGA 1920

TATATATAGT TCCATCATTA AACTACCTTT ATGTATATAT TTCATGTCAT ATTTCAGTTT 1980

TTGTTGCGGT GTTAAGTCAT ATCCACCTTG AATTTGCGCA AGTCCTGTTA ACCCTGGTGT 2040

AACAAGACAT CTTTGCTCGA AACCTATCAC TTCTGAACTA AATAATTCTA CAAATTCCGG 2100

ACGTTCCGGG CGTGGTCCAA TAAAACTCAT TTCCCCTTTA ACAACATTAA TTAGTTGTGG 2160

TAATTCATCA ATGCGTGTTT TACGAATAAA CTTCCCGACA TTTGTTATAC GATCATCATC 2220

TTTATCAGCC CATTGCGCAC CGTTTTTCTC TGCGTTTTTG CACATCGAAC GTAATTTGTA 2280

TATTTTAATT AATTTACCCA TCTTCCCAAC TCTAACCTGA CTATAAATAG GGTTTCCTGG 2340

CGAATCTATG ACGATAGCAA TGGCGAATAT AACCATAATC GGTAAAGTTA AAAATAATAA 2400

AACAATGCTT AAAATTAAGT CAATCGCACG TTTAATTGGG TAATAGCTTT TTCTCACTTC 2460

TTCTAGTTTG TCTAATTTTC TTTGATAGGC ATAACCCTTA TTATTATGGA CAGCTTCAAT 2520

AATTAAAGTA ATCCTTTAAA CCTGTTTCTA CTGTATATTT AGGAACAAAT CCTAATGCCT 2640

TTAAGTTAGA AATATCTGCA TAAGAATGCT TAATATCTCC TTTTCGTGCT TCTTTAAATT 2700

CATGCTCGAC TGATTTTCCA TATAATTCAC CAATAATACG ATAAACCTCT AATAAATTAG 2760

TAAAAGTGCC TGTACCAATG TTATAACCGT GTCCAATTGC ATCTTTGTGT TCCATAATTA 2820

AGCGTACAGA TTGAACAACA TCATATACAT ATACAAAATC TCTAGTTTGC AGTCCGTCAC 2880

CAAAAAATGT AAATGGCTTG TTATGCTCAA ATGAATCGAA CATCTTTGAA ATCACACCTG 2940

AATATTGTGA CTTAGGATCC TGTCTTGGCC CAAATACATT AAAAAATTTA ACAACCGCTG 3000

TTGGTATGTT ATATAACGAA CAATAATTTA ATGTCGTCCG TTCGCCGTAA TATTTATCTA 3060

TTGCATATGG TGATAATGGT AAGATTAATG ATTGATCACT TTTAGGCAAA TCAGGAAGAT 3120

CACCATAAAC AGCTGCTGAC GAAGCAAAGA TAAAACGTTT TATATGATTA TTATATTTTT 3180

TAATGATTTC TAACAATCTT AATGTTGCTA CGACGTTTAT TTCTTGAGAT AAGATAGGTT 3240

TCTCAACCGA CTCAGCAACA CTAACTAATG CTGCTAAATG AATAACATAA TCAAATTGAT 3300

ATGTCTTCAT GATTTGTTCA ACTGCATCAT ATTCACGAAT ATCTAATTCA AACACATGAT 3360

CGTCAGCCAA ACTTTTAATA TTTTCTCGTT TACCTGTTCT ATAGTTATCT AGAACATAAA 3420

CATCATAATC TTGTTGTAAA TCATCTACTA AATGCGACCC AATAAAACGA GCCCCACCAG 3480

TTATCAAAAC TCTTTCCAAA TCTTCCACCT CATTTATACA TTAAAAATAT ATCATAAAAA 3540

CATAAAGTAT TGTAAGCTTT TTATCGATAT TTTTTATTTA TAAAAATAAA ATGAGATAAC 3600

TTTGTGAATT TTTATTGAGA TAAATTAGAT AGTGGTGTTT TTGTGATGTT TTATAATATC 3660

TTGGGTGTGT TAATACTAAT AATGCTTTCA ACTGATGCAT TAGACTGTGA CATCATAACT 3720

CACTTAAGAA CTTCGCTTAT TAATTTTCTA CCAATACACT CCCTTCTAAG TGCACTAAAA 3780

AATCCTTACT GCTAAGTGAT TAAACTTAAC AATAAGGATT TATTTATCAT TAGTGGATGA 3840

TTATTAACGG AATCTCATAC CACCATCTAC AATAATTGTT TGTCCAGTAA TGTAATCAGA 3900

GTCTTTACCA GCTAAGAAGC TCACTACATT TGAAACATCT TCTGGTTGAG AAACTCTGCC 3960

CAAAGCAATC TGACTTGTAA ATTGTTCCCA ACCCCATGCT TCAGGTTTAC CTGCTTCTTC 4020

GGCTGTTGCC ACTGCGATAC TTTCCATCAT TGGTGTTTGA ACGATACCAG GTGCGAATGC 4080

ATTCACAGTA ATACCTTCAG ACGCTAAATC TTGTGCGGCT ACTTGTGTTA AACCTCGCAC 4140

TGCGAATTTT GTACTGCAAT ATAAAGACAA GCCTGGGTTA CCCTCAACGC CTGCTTGAGA 4200

TGTTGCATTG ATAATTTTAC CGCCATGATT GAATTTTTTA AATTGTTCAT GTGCGGCTTG 4260

AATACCCCAT AGCACACCTG CAACGTTCAC GCCATATACT GTTTTAAACT GTTCTTCAGT 4320

GCCAAATTGC GCGGCAGTTT GTCTTAcTGC GTTAAATACA TCATCACGGT TTGATACATC 4440

TGCTTTGATA GCAATAGCTT TTGTACCATC ACTTGATAAT TTAAGTGCAG CTGCTTTTGC 4500

CCCTTCTTCA TTGAAATCAA CAACTGCTAC TTTGAAACCA TCTTCCACTA AACGTTCTGC 4560

AATTTTAAAA CCAATCCCTT GTGcTCCGCC AGTTACTAAT GCTACTTTGT TGTTTGTCAT 4620

AAAGATCACT CCTCAAATTT CTTTCCTTTA ATTACATTTT ACTCCTCTTC ATTTGAATAG 4680

TACAACAAAG GTAGCTCCAT TTAACAAAAT ATTCAGATAT TTAAGGTATA GTTAAACGCA 4740

CTACCATTAG TGATTGGCAA TGCGTTTAAA TGTCGTTTTA AAAGTTCTTA TGTTGAATAT 4800

TATTTTTTTA AGTCTCTCGA TTAGTTTGTC ATCAATCTTT TTTCGAGACA TGGTCTTTTG 4860

ATTCAATAGG CGGTTCCGTG TTATCACTGA CAACTTTAGT TGTAGCTTCA TCTTTATGTA 4920

TTTCTTCGTT AAATCCTTCA AGGTTTTTAG TCGTGGGATT TTTAACCTCA GGATGTTCCA 4980

TCATGTCTTG ACTATCAAGT TCCTTTTTAC ACGTGTCTTT ATGTGATGCT TGATTTGCGT 5040

TCCCTTTACT TTTTTGAATA GTGGTAGTAT CTGCTGCAGC TACTAATTTT TTTCTACCTA 5100

AAATAGATAT GGCTGAAACA AACCAGAGTA TTGCAGATAC AAAGTTGCAT AATACTAAAG 5160

CGATAATAGC CAATACAATT AATATGACAC CTTTTGAAAT CCTTTCTTTA AATAAGTCAG 5220

ATGCCAATAC GATGACAGGT ACGATTGAAA GTATAATTAC AAATATAGAA ATTATTGCCG 5280

ATATAACTAT TGTTACTATT AAATAATCAG CTCTGCTACC TGATAATAAA TAGAAAAGGC 5340

CGAAAATTAG TCCATAGCAA ATTACAAACC CACATAAAGT TATAGCCATG AGTACTATAT 5400

AAGCTATTTG AAAATATAAA CCTATCTTTA TGAATGATTT TTCTACATTT TTTTCCATGT 5460

CTATTCCCCA TTTATTTAAA ATTTATACTT TACCTTAAAT ATTCTCTTTA TTCTTTAGTG 5520

ATTTTATCTT TAGATTCAAA TTGATTCTCT GTACTTTCAA TATCAACTTT TTCATTTTCG 5580

TCTGTCGATT CATCTTTTGA GTATTTATTC CAAATCAGCA AAATACCACC AATCAGCCAT 5640

AAAATTGACG AAAGGAAATT ATATAAACAC AGTGCAATAA TAGCATAAAC AATAAAAAGT 5700

GCACCTCCGA TTACAGAGTA ACTTTCCATA TAAATCGCAG TAAAGATGGT TGGTAAAACA 5760

GTGAAAAGAG CCAATATTAA TCCTAATAAA AAAATTGTTT CGTAATCAGA TCCTCCAGCA 5820

ATATTAATAG ATATCATCCT AACAAAAACG ACACTAAAAT ATATTTGAGC TACGATGCCT 5880

ATCCAAATTG CTATTTTTCC TATAATTGAG CTCATACTCA TTCCCCATTT ATTTAAAATT 5940

TATACTTTAC CTTAATATAC CTTATTTTAT TTAATTTTTA TATGCAAAAT ACAAAAATGG 6000

AGAACTTCAA TATTTATAAA ATATCAAAAG TTCTCCACAC TATATTGTTT TATTATATTT 6060

TCGCTATCAA TACGCTAAAT CATCATATTT CCCTCAACAT CACAGTAAAA CTATTGCTCC 6120

TTCCAATTGC GCAGTTGTTC AACATCATCA TCTTGTTTAA GTAATGCCAG TGGTACTTGA 6240

AGATTAAGAG ATCGTCCTGA AATATTAAAG CGTGTCACAC CTGCTGGCAC AGTTTCCCCT 6300

TTATGAACAA CCGCTTCAAT TTCCTTATAA CTCAATGGCT GATACTTCAT GAGTACATCT 6360

TGTTGAGAAA GACAAGGATA TGTACCTTGT GCAATTCTCT CTACAGAACA ACAACCACTA 6420

TAACTTGCGA CAACCTTTTC CCATACTTGA AAATGTGCTT CGCCTAAATC TTTTGTATAC 6480

AAATATTGTT CTGTATCACC ATGACACATT GTAATAAATG GCGCTTCTTG TCTTGTCTCA 6540

GTAGTCCATG GCAAGCGATG TTCTTGTTGT AACGTTTCCC ACCACACACC AAATGGAACT 6600

TTATGTTGCC ATGTACTAAT TGAATATTGT GTTTCATGGA TTTCTTGCAC TGGAACTTTC 6660

TTACATCCTA ACGCTTTCAA ACTTGTATAC CGATGCACAC CATCTATAAC CATATATCTA 6720

CCATGTTGCA TCGCTGTCAC TAAAATAGGA TGACGTATAA AATCATCTGC TTCAATACTA 6780

CTTTTCGTTT TTTCCAATCT TAAAGGTTCG AATGTTTCGT GAAGATCAAT CTTATCTACT 6840

GGTACCAATT TTAAATGTTC ATGAATATGA TTCAATAGTT ATTCATCCTC CTTTGTTTGT 6900

GTTAAATAAA TAAATTCAGG ATGTGGATGG CTTAAGAAAT CGTGATGTGA AATAGACCAT 6960

CCGTATGCAC CTGCATATTT GAAAACAATA ACGTCGCCTG TACTGATTGC GTCTATCTGT 7020

ACTTCTCTAG CAAAGACATC TTTCGGTGTA CATAATTGAC CGACTAACGT TGTGTCCTGT 7080

CTCGAAATTG AAACTTTTTC AAATGAATAT GGATTGTCCT TATAGCGATA AATGTCAAAA 7140

GGATGGTTAT GTTGCCAAGA TACCGGCAGT CTAAATTGTT GCGTACCTCC TCTTAATATG 7200

GCATACCAAG CACCATGTAC TTTCTTAATG TCTAGCACTT CTGTCACATA GTAACCAATA 7260

TGTGCCACAA TAAAGCGCCC ACATTCAAAG TTCAATGTCA CATCTTCCAT TTCTTGCTCA 7320

ACGATAAGTG TTTTAAAACG TTCTACAAAA TTATCCCATT CAAATTGGTT AGTTAAATCT 7380

GCATAGTTAA CGCCTATGCC ACCACCAAGA TTGATATGTT TGAGTGGAAA TCGATGTTTT 7440

TCAGACCATG CCTTTGCTTT TTTAAAATAA AGTTTCACTA CATCGACATG TAAATTCGAG 7500

TCTAAATTGT TAGAAATAGA ATGAAAATGA AATCCATCTA GATGAATCTT TGGCATTGCG 7560

AGCGCAgcTT cAATGACATC ATCAACTTCG TCTTCAGAAA TACCAAATTG TGTTGGGCGT 7620

CCTGCCATAT GCAACGTTGC ATTGGGAAAT GGTCCTGCTA AATTAACACG CAATAAAATG 7680

TGTTGTGTCT TATCTTCATC TTCTAAGATG GCATTTAGCC GTTGTAATTC ATGCATACTT 7740

TCAACATGAA TACGCTGAAC ACCTTCACTT ACTGCATATC TTAGTTCCTC GTCTGTCTTA 7800

CCAGGGCCAC CAAAAATAAT ATGATTTGCT GGTTTAAAAG CAAGACCTTT TGCTATTTCA 7860

CCTTGAGATG CAACTTCGAA TCCTTCAACA TACTGACTAA TTGTATCTAG GATTTTTCGT 7920

TGTTGCAAAT GATGTTCCAG TCCGACTAAA TCATAGATAT AATGACAAAC TGGATGAGAT 8040

TGTGCTTTTA ATTGTTCAAT AACAGGTTGA ACTATACGCA TTAGCCTTCA TCCCCTTTCT 8100

GTTTAGACGT CGCTAGAGAT GCACTTAAAT GGCGATATAT TTTTCCGCGA TCATCACCTA 8160

AAATAAATGT TTGTACACCT TGTGCCTGCC ATTTTGCAAT ATCTTCATCT TCACGTGGTA 8220

ATGCACAAAA ATGTTTACCA TGTGCATTCA CAACTTCAAA AATATGTTGA ACATGTGATG 8280

TTACTTGATC ATCACGCGTT TGCCATGGTA TGCCAAGTGA CTGCGATAAA TCTGCGGCAC 8340

CTTCGACTAT CATGTCTAAA CCTTCGACTT GTGCTATATC GTCAATGGCC ATAACCCCTT 8400

CAACATCTTC TATCATGGCA ATCACCATAA TATGCTCATT AGCCATCTCC ATTGCATCAA 8460

GTAATGGTGT ACGTCCAAAT CTTGCCATGC GACCACCATT CAAACTTCTT AATCCTTGCG 8520

GGTAATAACG ACTTAATTTC ACAATATGCT CAACTGTCTC ACGATCTTTA ACGTGTGGCA 8580

CAATAATACC TCTCGCACCC ATATCCAACA CTTTAATGAT ATCTCTATCT ATCACTGCAG 8640

TGACACGTAC AATTGGTATA ATATGCGCTG CTTCAGCTGC ACGAATTAAA TGCGCTAGTG 8700

TCTCATCATT AATCGCCACG TGTTCTGTAT CAATCACAAC AAAGTCATAC CCGCTTGCTG 8760

CGATAACCTC GATCATCAAT GGGTCCGGTA TAGAATTAAA AATGCCATAA ACTGAATCAC 8820

CATTGTTTAA TCTATGTTTC AGAGATAGTT GTTGCATCAT TGATACCTCC TACACCTAAT 8880

GGATTTGTAA CATGATGAAT TCTTAACTCG GAGTCACTTA ATAATCGACG TGTCGTTAAC 8940

TTTTCAACTT GAATCGTAGG TTCAAACAAA TCGAAATGTT GATAGTTATT CAACTCTGGA 9000

AATGCTTCTT GATACGCCTC GATGATGCCT TTAACCCATT GCCATTGCAG CTCCTCATCG 9060

ATACCATATT GCTTTTCAAT AAATAAGATG ATTTCGGCGA TATTAATAAA GAAAAATGCA 9120

TCATGTAAAA AGTCGCGTAC TAAACGTTCG TCATCTGTTT CAATAAATGA ATTACTATTC 9180

ACTTTTTTAT GTGCTTCTGG CATTGGCTTT AATGTCAGGT GTGAAGCAGC TTCACTTAAA 9240

TGctCACGCT TAAAACGAAC ACCATCATGG AAATCTTTTA AGGCAATACG TGTAGGCCAA 9300

CCATTTTCAT GAATGAGCAT CATATTTTGT GCATGCGATT CAAAGGCAAT ACCGTGATAA 9360

TAAAGCATAT GAATCATTGG ACGAATCGCT ACAGCTAAAA ATTGCTTTGT CCAAGCTTCA 9420

GAACCATATT GTTTAATCCA ATTTTCAATG AATGGTACAC CATCCTTATC ACTTGCATAA 9480

AGTGCATTAA ATGGTATCGC ATCCTCTTCA TCGATTAACA TATGATATAT ATTTTCACGC 9540

CATATAACAC CTAACGCACC ATAAACTTGA GTTTGTTTAT AAGGCGAAAG TTGTGTATTT 9600

AAATAAGACT GTCCTAAGAC TTCCCCTAGA AAAACTGTCT TTAATTCATC TTTTAAATAC 9660

ATATCTTGTT GCTGTATCTG CTTTAACCAA TCCGTAATTT GCGCTGCATT TTCAATTGTA 9720

TATTTTGTCG TGTCTATTGG CGACATCGTA CGAATCGATT GTTGAGGGTG ATATAGCTCA 9840

TCACTTTCCC CTAACCATAG TACTGTGCCA TTAAGCCTTT CTTCAGCCAA ATCAACTTGG 9900

ATGACATGTT CAAACTGCCA TGGGTGTACA GGTATCATCT CAACATCATT TACATGTTTG 9960

CCAGATGCTT CAATTTGCTG TACAAAATGT TCATAAGTCT TATCGCCAAC TTGTTGACGT 10020

AACATTTCGT TAACTACAAC ATTTCTTGAT ACCGTCGTTT CTACTTTATC TTTGTCGATA 10080

GCTAACCACT GCAGTTTAAC GTTTGGTACA AAATCAGGAC CAAATTTCAA ATTATCACTC 10140

AACGTAAATC CTAAACGTGA TTTGTAACTT GGATGATACT GATGCCCTTC CATCGCATAA 10200

AATTCATAGT CGTTAAATGT CTCAGGTGTT GCTGGTGGGT TTGATTCTCG ATACTGCATA 10260

CTTTGCGTAT CTTTTAATTC TGTCTGTAAT AACTCGACAA TAAATTGTTC TAGCTTTTCA 10320

TCATTTTTAG GAAATGTAAA TACAACCTCT CTCAATAATT GTGTATAGTC TGTTGTTGTA 10380

TCTGCCTCAT CTCCTACGAC ACGCTCAATT GGTGATGTGA TACGTATACG ATCAAAGCTA 10440

TGTGTCTTTT CAGCAGTAAA ACGATACTCT GAATCATGTC CTTCTATTGT AAAATGACCG 10500

ACACCGTCTT GATATGACGC TTTATACACA ACAATATTCT CATAAATAAG TGATGATACC 10560

AGTTGGTGCA TCACTCTAGT CTTTACACGA TTAAGAATTG TTTGATTCAC AATACGATAC 10620

CTCCTTGTTA TGACAAATTG GATTTGGTAT ATGTGTATAA ATAGGGTTTG CACCACAATC 10680

ATTCAATTTA CTCATCAAAT TCGCTTTAGC CGcAATGGTC GGCGTTTGAT ATAAATCTTC 10740

TACACAGTCA ACAAATACTG CGTTATTCGC GTATTCTTTT TTCCAAGTCA TAAGACGATG 10800

CGCTACAAGT TGCCATAACA CAACTTCATT TCTAGTCGCT TTACCAATAG TTGATACTAA 10860

ATGTCCTAAG TGATTTACTA CAACGTAATA TTTAAGACGA TGCCATGCTT CATCATGTGC 10920

ATATACAACA GGGCTTGATG CTGCCACAAC ATTTGGCACA AGCTGTTTTT CAGTAGCAAT 10980

CGTTCTAGAT AGACAAATGC CTTCAAGATC TCTGACAAAG CATACGTCGG GTATGCCATC 11040

TTTTAATTCA ATTAATGTAT TTTGTACATG TGCTTCTAGA CTAATGCCTG TGTTACTAAA 11100

CAGCTTTAAT ATCGGCAATA ATGTACGATT CAAATAACAT TCAAGCCATG CTTCTGGTGC 11160

TAAACCACTT TGCTCAATCA CTTGTGATAA CTTAGACATC GGTGAATCAG GCATCGTTTC 11220

AAATAATGAC GCCAATACAT GAATATCTTT ATCAGCATGG TAATTCGGTA TCCCTTCACG 11280

AACAATCATG GCACTATTTG TTAATAAATC CATTTCAGGT TCAACTGTTT GCCCTAATGG 11340

ATTCGGTAAC AATGCACGAT ATCCTTCTTC AAACATCAAT TTAAAATGGG GTGTTTCAAC 11400

CTCATCTTTG ACTGATGCGA TAACTTGCGC GGCATCAATT GTCCGTTCAA TCTGTTCAAG 11460

GTCATTCGTA CGTATAAAAT TAGTGATTTT AACGTGTATC GGTAATTTTA AATAAATGTT 11520

GCCAAGGTCT TTTATTAAAC CTTGTTCACT ATATTGCATA TACTGTGGAT GCTGTCGCAA 11640

CACATTGATT TGATAAGGAT GTGTTGGTAA TAAAATAAAA TCTTTGGGTA TCTCTGATAT 11700

ATCTATGTCT GCTAATTGAT ACAACACTTT CTCAACCTGA TCTTCTTTAC CTTCTACATA 11760

GCGCGTGAGC AGAACATCTT GATGCACAGC TAAATAATGC AATTGGAATG ATGTATGACA 11820

TTCGGGTGCA TATTTCTCTA AATCTGCTTC TGAAAACCCA CTTGCACTCT TAGGAGTCGG 11880

ATGAAATGGA TGACCTAAGT ATAAAGATTG TTCTGAAACG ATATAACGAT CCTCTACGTA 11940

GTCTATTGTG TTACTTTGCA AATAACGTGC CGTGCGATGA ATGCTATTAT CGATGTCAGA 12000

CATAATTTGC GCCATATGTT GTTGCACTGC CGTTTGATTA TCTGCACTTT GAGCCATATG 12060

TTGCAAAATA CGCGCAATTG CTTCTTTATA AGTTGTTATT TTTTTACTTT TTCCATCGAT 12120

AAGCCATACC TCTGGATGAT ACATATGATG CCCCATCGCA GACCAATAGC GAAATTCACC 12180

CGTTAAAGTT TCGAGCTCTG ATAATTGTAT AGACCATTGA TGATTTTGAG GTGGTACTTG 12240

ATATAAATTT TCTTCTCTAA AATATTCATT TAAAATGCGT TCGATAGCCG CATACGCTGC 12300

ATGTTGTATT AATTCTTTAT TTTGCACTTT TTTGTTTCAA CTCCCATAAT TTCATTAATG 12360

TGTGATCGTT GATTTGATTA GTGATGGTTG AACAAATTAA AAATAAACTA CTTACTGCAA 12420

ATACTACGCC CATAACGATA AACGTAGTAG CTGGTGTAGT ATAACTTGTA ATGGCAGCGC 12480

CACTaAGACT GCCAATAATT TGACCAACAA CTAACATACT GTTCGTCGTT CCAACAAATG 12540

TGCCTTTAAG TTGTTGATGA CACGCATTCA CGACAACAAA CATGACACTT TGAATCAATG 12600

CACTATATGT TAATCCTTGA AGTATTCTTG CAGCCATTAA AAACTCTATA TTCGTCGCTA 12660

AACCTTGCAG TATCGCACTA CAACCACATG CAATCGTGGC AAATATATAT ACTGATTTAA 12720

CATATGATTT ATCATTAAAG CGTCCCCATA AAGGCGCGCT TAATATCGAA GCCGTCCAAA 12780

ATGCGGACTG TAAAAATCCA ATCACACTAC GGTCATCTAT CGCTGTATGA TTCACTGATG 12840

AAGCAAGTGG TGATAATGCA GTTAGCATGC CATACATAGC AAAGTTTGCT AAAACGCCAA 12900

CGATAATAAA TCGACATGTT TGTTGTGTGC ATAATAGACA TTGAAATGAA CGGCGAATAC 12960

CTTTATTAAT ATTTGGTGTT TGTGATTTTG GCATATGTGT CGTTTCAATC AATTTTAATG 13020

CACCGAAAAT ACAGACAATA AAAGTAATAA CGGCAATACT CATCAGTAAC GCACTAAAAC 13080

CTAATATCGA AGCTGTAACA CCGCCAATTA ATGGCCCCAC AAGAGACCCT GCGCTGACTG 13140

AACTTTGCAG TCTTCCTAAT ACCTTTCCAC GATCTTCAGC TGGCGCCTCT GCACTCGCAA 13200

ACGCACTTGA TGCATCAACA ACACCACCAA ATAGTCCCTG CAATAACCTC ACAAGTACAA 13260

ACTGTAATGG TGTCGTACAC AATGCCATTA AAAATAAGCA TACCGCCAAA CCAAGTAACG 13320

CtATCATCGT CGTTACAGCT GGAGCAGCAA TCGCTATACC ACTCCACAAC TGTATTTCTA 13440

CGACTGATAG ATTTTGTAGT GATGCCATAT AAATTGGCAA TAATGGCACA AGTACTGTCA 13500

GTCCAGCAAT CGCTATAAAC TGACTGAGCC ATAAAATGCG AAAGTTACTG CGCCATATAG 13560

ACTGATTAAT CATATGTCAC CATTGGATTT GGTACGGTAG TTAAACCTGA AGGCATACTA 13620

CCTCCACCAC TATCACGTTG ATATAGCAAT GGTAATAAAA TTTGTTTGAA TGGCCACGTC 13680

TGTTTATCAA ATAAAATGTG TCTGACAGCT AGCTGATCAG TTGTAACCCA GGAAATAGTT 13740

GCCACTTCAT TTTTTAAAAT TTGTTTTAAC AACGACATAA GTTCATGCTC ACTTACACCA 13800

AATAAATCTT GAATTGCATC AATAATGGCA TATAGATTTA CCGATACAGC TAATGTTTGA 13860

AAATAAGCAA AGAATGTTTC CAAATCCTCA TTAATTAGCG TATTAGGTGT ATCTTCTCTG 13920

ACGACATACT TCGGCAATGA AAGCTGATGT GCTGTTAGCC ATGGTTTATA AATTCTGACA 13980

GTATCATGAT CACGTAACAC GCATTTTTGT ACACGTCCAT CTTCAAATGA CAACAATATA 14040

TTTTGACCAT GCAACTCTGG TAATGCGCCG TATTGCATAA ATGATAGTGT TACCTTTAAA 14100

AAGACTTGCG CGATATCTTC AAATAACGTC ATGACATCAT TTTTAGAAAT ATTATCTTTT 14160

CCACAAATCA TTTGATATAA AGTGCGATCA TTTGCCGCGA GTGCTGCCAT TGACACTAGC 14220

TGTTGCGTAT CATTTTTGGC TAGCACTTCG GGATACTTTC TTAGCTGAAC AGTTAGATGA 14280

CCTAATTGAT CTTTGAAAAT ATCATTATCT TGACCCATAT ATGACCACCA AGCTGTTTCA 14340

TCACAAACCA TGACATACTT AGCTAGTGCT TCATCTTTTT CTATAAGCTG ACGTAATAAT 14400

TGTTCTGCTT GTTCTCCGTT TTTCATGTAA CGCGTAGGCG TTAGCCTTAA TGCGCCTAAT 14460

GACTGCATTG CAAATGGTAC TTTGACATGG TTATACGGTG CGCCAATATC AATTAATGAA 14520

CGCATACTTG AAGACGACAG ATAATCTCCA AATTTTAACG GTAATAGTAC AACCAACTTT 14580

TCACTAATCT CTTTCGCAAA GACGTTCGGC AGAATATGCT GATATTGCCA AGGATGTACC 14640

GGAAATAGTA CATAGTCATC TATTGATAAC CCTTGATCAT TTAACATGTC TGTCGCTTGT 14700

TCTTTTATAG GTACTGTCAA ATTTTCTAAT TCATCGATAT TTGCAGTATC GCCATGAATC 14760

ATATGTGTCT TTTTAACTGC TGCAACCATT AAAGGAAATG ATTGATTTAA TTCAGCTTGA 14820

TACACTTGAT AATCCGCTTC TCTTAATCCT CTTTTTTCTT TAGCTAATGG ATGAAATGGA 14880

CGATCTTTTA AACTTGCAAA CTGCTCTGAC ATCACAAAAG GATGTGACGC TAAATCTAAT 14940

TCTGATAATT GTTTAGCAAG CTGTGTGGCA GCAGTAGTCA GTCCTTCTTC AACGCGAGCC 15000

ACTTCCCATT CATGACTTAG ATCACAATTC ATATTAGCAA TTGTTTGCCA AAATTCAGCT 15060

GCCGTTAAAG GTTGCTTAGA CACCCTTCCC TCTATCGTAA TTGGTTGTGA ACTTTCGTAA 15120

TATATCAAAA GCGTTTGTCC GTTTTCTTTA GTAATCTCAC TATTCGATAC AATTCCGGCT 15240

ATATCTTCAA ATAATAATGC ATCAACTAAA TCTCTTAATA TTATCGCTTG TGCTGTATTG 15300

ACTGCTGTAT GATTCTGCAA TGTTCAGACA CCTCGCATTC TTAATATAGG TTCAATGTTG 15360

TCCCAATATT TTGTTGTTGT GCCTGTTGAT AAATAAAATA AGCACTTGAA ATATCTTCGA 15420

TAGCCATACC CATCGGATTA AGTAATATGA TCTCATCATC GTCTTCACGT CCTGGTATGT 15480

CACCTGTCAC AAGTTGTCCT AGTTCAGCAT GAAGAGCTTC TTTGCTGAAT TTACCTTCTA 15540

ACACCAATTG GTTAATAGTT TTCTTTTCTC GATTACATTG TGACCAGTCA TCTACTACGA 15600

CTTTGTCAGC TTTAATAAAG ACTTCTTTAT GCACATCCAT GATAGAAATG TTGCTAATAA 15660

ATGCACCCTT TTGTAACCAA TCATATTCAA TGTATGGTTG ATCCGTTACG GTACATGTAA 15720

TGACTACTTC ACCATTTGAT ACTGCTTCTT TAGCATTTTC TGTCGCAATA AAATTAATTT 15780

CCGGACGCTG TTGTTGCCAT CTATCAACAA AGCGTGCACA TGCTTCAGAG AATTGATCGT 15840

AAACAAACAC GCGTTCAATA TGATCGAATT GCTCTAACAT ACTTTGTAAT TGCTTGTCTC 15900

CGATTAGCCC GCATCCAATG ATTGTTAAGT CTTTAAATCC TTTTTTAGCC AAATGCTTTG 15960

CTGCAATCAC TGAAACTGCT GCAGTACGCA TACTACTAAT TAAACTTGCT TCCATAACTG 16020

CAATTGGATA ATTCGTTTCT GGATCATTCA AAATAATGAC GCCACTTGCA CGCTCCATAT 16080

TACGTTTCGA TGGATTGTCG TGCTTACTAC CTATCCACTT AATACCTGAA ATTGCGTGTT 16140

CACCACCGAT ATGACTTGGC ATTGCAATAA TTCGATCTGC GATGTGTCCA TTTTCAGGAT 16200

CCtGTCTTAA ATACGGCTTA AGCGGTTGTA CAAAATCATT GTGCGCATGG GCTGTTAATG 16260

CTTCTGTTAA TGCGTCCACA TAAACTTGTG AATGATTACC TCCCGCTTGT TCAATATCTG 16320

ATCTATTTAA ATACAACATC TCTCTatTCa TTCTGaTTTA ACTCCTTGTC TTGATTTCAT 16380

TTTTTCTAAC CATGTATCTG AATAAACTAA ATCTAAGTAA CGATCGCCTC GATCTGGTAA 16440

AATCGTGACA ATTGTTGCAC CTTCTTCAAT TGACGTTATC AACTGCTCAA TCGCTGCAAT 16500

AATCGAACCT GTTGAAcCTC CGGCAAATAT GCCTTCATAA TCAATCAGTT TTCGACAGCC 16560

CAAAGCAGAT TGATAATCAT CTACATGGAT CACTTGATTA ATTTCTGATC TATTCAATAT 16620

TTCGGGTACA CGACTAGCAC CGATACCAGG TAATTCTCTA TTAATAGGTT TGTCACCAAA 16680

AATGACTGAC CCTTTCGCAT CAACAGCAAC AATTTGTGCG TTTGGATGCA CTTCTTTTAT 16740

TTTTCTACTC ATACCCATAA TGCTACCTGT CGTGCTGACT GGCGCGACAA AATAATCTAT 16800

AGGTTGCTTA ATTGTTTCAA CAATCTCTGT GCCTGCACCA TGATAATGGG ATTGCCAATT 16860

TAACTCATTC GCATATTGAT TAATCCAATA TGCATCGTCA ATAGTGGCTA ACAGTTCTTG 16920

TACATTGGCA CCATAACTTT TAATAATTTT CAAATTTGTT GGTCATATTT TAGGATCAAC 17040

AACACACGTG AGTTTTAATC CCTTGATTTT AGCTATCATT GCCAACGCAA TGCCTAAATT 17100

ACCAGAAGTA CTTTCAATTA AATGTGTATT CTCAGTGATT AAACCATGTT TAATACCATG 17160

TTCAATGATG TACTTGGCAG GTCGATCTTT CATGCTGCCT CCAGGATTCA TATACTCTAA 17220

CTTTGCAAAC ACTTCATGTT TCGGAAATAG TTGATGAAGT TGAAGCATAG GTGTTTGCCC 17280

TACAGAATCT AACAATGAAT CGTGCACATG 17310

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5423 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

ATACTAGTAA GCGCATCGGT TATTGACATC GAATTCAACT TTAACAGTTT TCATGTTCGG 60 

TGATGTTTCa ATAGAATGTG TGTGTTGTAC TTGCGCATTT ATATTTCCAC CTAAATTACT 120 

TAAGTTTCCT GTAATACTAG AAATGTCAGG TGCGTTTAAT GTAGGTTGAA ATGCATCAAC 180 

TACTTTATCT GCAACATTAG AAACATTACG GATAACTTTA CTTGAATGAT TATCTATACC 240

TTTAACGAAA CCTAACATTG AATACATACC AACATCCATG AATTCACGTG AAGGTGAGTG 300

AATACCTAGC GCTCTTTTGG CTGCATTTAA AGCACCTTTT GCTACACTAG CTGCTTTTTC 360

AGCTAAGTCT CTAGCCATAT TACCAATACC TCTCATCAAA CCACGGATCA TATCAGCACC 420

TGCTGATACA AAGTCATCCA CAAAGCTTTT AACTTTATTT ACTGCATTTG TCATACCTTG 480

ACTAACTTTG TTTACAACAT TAACGAATCC TTGAACAACT CTATTAACAA rGTTAATTAG 540

CGTACtTGTt ATAGTAGATA CCCaTnGCAT ACCTTTAGTG ACmATGAAGT TCCAAGCTTG 600

AGACATTTTG TCTGATATAG TTGAAACAAC TTGTGTGAAT ATGCTTACAA CTTTATTCCA 660

AATTGTCGTT AATATACCAG ATAAGAAACT CCAAATCGTA TTCCATATAT TAGAAATAAA 720

ACTCCATGCC GCTTGTAACG CAGTAGATAT AGCTGTAGTG ATAGCGTTCC AAACCTTAGT 780

TGCCACAGTA ACTATAGTGT TCCACAACGT TTGTAAGAAC GTCCAAATAG CGTTCCAAAT 840

TGTTATTGCG ATAGTCATAA TTGTGGTAAA CACTGTAGTT ATTACAGTGA CTAACAAATT 900

CCAAATCGTA GTAGCGATTG TAATTATCGT ATTCCAGATT GTACTTAAGA ACGTCCAAAT 960

AGCTGTCCAT ATCGTCATAA CTATTGTCAT TATCGTCGTG AAAACAGTTG TAATGATTGT 1020

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ATAACCCACT ATTTGATTCC AAACAATCAT TATAAAATTG TAAACATTCG ATACTGCTGT 1140

AGTGATAGCT GTTAAAATAG CATTCCATAC AACCGAAGCT ACAGCTTTTA ATACATTCCA 1200

AACATTAACC ATAAACGTTT TTATCGCATT CCAAGCATTT ATAATAAAGT TTCTGAATCC 1260

TTCATTTTTA TTCCACAATA AAACGAATAT AGCTATTAAT GCAGCAATTA CACCAATTAC 1320

TATTGTTATT GGACCGCCTA AAATACCAAA CACAGTTACT AGTCCTGTGA TAGCATTTCT 1380

AATTAATCCA ATCTTACCGA ATAACAATTG GAATATAACT GATATAATTT TTAATGGTCC 1440

TTTTAATAAC ATGAACGCAC CTTTTAAAAT TGTTAATCCC GCTCTTAATA AACCGAACTT 1500

ACTTACTAAT GCAATGrTTC TACCTATTAA TCCGCCACCC ATAAAGTTAG ATACAGCAAG 1560

AATAATCGGT ATTAAAAATC TAAATGCACC AACTAAAGTT ATAATGACAC CAACTAATTG 1620

TGCTGTAGCT GGATGCGCCT CAAACAAGTT AGCTATCCAA CCAGTTATTG CAACTGCAAC 1680

GCGTAATACT GCACTAGCTA TAGGAGCCAT TGCTGTTGCG AATGCArmTA ATCCTCTTGC 1740

GATGTTTCCA ATCAATTGCA TTATTAGTGG TCCATTTGTT TGTATATAAC TGACAAAGTC 1800

TTTAAACCCT TGAGATTGTC CTACTTGTTC AGACCATTCC CTAAACTTAG CTGTCATTTG 1860

TTCAAGAGAT TGGAATATGC CAGTTGATGA TCCGCTGAAT GCATTCATCA AATTGTTAAT 1920

TCCAACGAAA ACATTTTTGA AAATATTACC AATGATAGGT AAGTTTGTTT TTGTGTATTC 1980

AATAAAACGA GTTATCGAAT TTTCTCCAGC TGCACTATTA GCCCAGTTAG AGAAAGATTG 2040

ACCTAATCTA TCCAACCAAT CAGCCGACCA TTGAAACAGT GGTGCTAATT GCGTGAATAC 2100

ATTGACTAAT CCGTCACCAA AACCACCTGC AGCACTTAAT AGCTTGTTAA ATACCGAAAC 2160

ACCCGTTGTA TTCATCATAT TAAAGAATCT TGAAGCTACA CTGCTATTTT CAGCCCATTT 2220

AAGCACGCTT TGAGACGCTT CTTCCATTCC TCTTGAAATA CCACTAAAAA ACGGTTGTAA 2280

GCTCTGCATT GCAGTTTTAA CAGTATTTAA ACCATTTGCA AGAGTTGTGA AGATAGCGGA 2340

TTGATTTTGC TTTATAATAT CAGTCCATGC TGACTTTACG CCATCTAACG CTTTTTTGTA 2400

TTCGTTTGTT GCTGAGCTAG CTTGTAAAGT GCCATCATTA AGCATCTTTA TAGCGCTGAT 2460

AGCCATTGCG CCAAACGCTA CAAATCCTGC TCCCGCTATT GCTACGGCAC CACCTAAAGC 2520

AAGTACACCA CCAGTTAACA CTTTGATAGC GTTTAATAGC GCAAATACTA CAGGTACTAC 2580

GCTCGCTATT ACAGGTATTA AGATACTAAA AGATGATGTA AGTAATCCAC CAACCATATT 2640

AGAACCTACA GTACCGAACA CACGGAACAT ATTAGCTAAA TTCCCCATCT GTCTTTGAAA 2700

ATTGTCATTT GCTTTTATTA TGTAGGCATA AGCTTTCTTT AAACCATTAG TATCGACATC 2760

TACCTTTGTT GTTTTTTTGT TCGGCAATGC GTCTAATGAT TTTTTAAACG CATAAATAGT 2820

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AAGTTCTTCT TTAGTACGTT TGATTTTAGA GTTAGCAACA CCATTGTCCA CGTCTATAAT 2940

AGCTTTGGCT TTAGACCTAT TTAATGCTTC GAGACTAGCT TTAGATACTT TTAACACTCG 3000

ATTGAATTTA CTGTTATCTG CATTGACGTC AATATTGACA CGTTTCTTTT CTAATTCTGA 3060

TAATTTAGCT TCTGTTTCAG CGATATCTTT AATCAACTTT TGTTTTTGCA ACTTAACTTC 3120

TGGTGTAACT TCTTTAGAGT TTAGTTTGTC TAGTTCAAAA TTCGATTCTA GTACCTTTTG 3180

TTGTAAATCT TGTATACTAG CATCTAATTT AGCTTTTACA TTTTTGTTAC TAAAGGCATC 3240

TAAAGACTTT TTAGCAACTT TGATAGTTTT TTGTAAATTT TTATCGTTAG CGTTTAATTC 3300

AACATCTTTA GTTTGATCTG CTACTCGTTT AAATCTTTGC ACAGACTTAA CCGCACTATC 3360

AATTTGCCTT TTGAATTTGG CTACACTAGC TTCAATAGTC GCTTTAATTT TATATTCCGT 3420

CACATTAACA CCTCTCTTTC TATTGCTTAT TAAATTCTGC TATAACTTTA AAGAATTCAT 3480

TATTTTGTGG TTCGTATTCA TCACGTTCGC TACTAAATCT TATATCTTTA CCTTCGTTAA 3540

GCCGTTGGAT ATTTTCTTCA TAAGGCAATA CGTCGTTTGC ATTGTTAAAA ACATATTCCT 3600

CTTTAGGTTT ATTTTCTGTC CCAACATTTT TAGTAGCTGC AGCATCACGA ATAGCAAACG 3660

CAAGTTTGTA ACGTTCGAAT TCTTGGGTTA GCATTTCATA CTCTTTCGCA TACATTCGAT 3720

AGTTATATTC TGTTAATGTC ATTTGCTCAA TAACGTTCAA ATCTGTAATA CCAAGTGTTG 3780

ACATACAAGT TATAACGATT CTGTCGTAAG TTATTAGGcT TCCGCTGGTT TTTCTTCCGT 3840

TTCCACTACT TCGACTAGGT TTCGGGTCAT AGGTCGCTTT CCCAAcTCCG TTAAAATATC 3900

CGAACCGAAT TCTTCTAGTC CGATATTTTC TGCGATTTCA TCTAATGCTT CATCAATGTT 3960

ATTAATAGTA ATTGCTTGTT TTTTTAAGTG AGATGTAGCT GCGATTAAAA cTTCGCCAAT 4020

CACAACCGGA TTTCCACTTT CTAAACCTAC AGGCAACATT GATACACCTT GACCGATAGA 4080

AGCTTGTTCA ACTTTTAAAC CTAATCGGTT ATCGATTTCT CTTAAAAATT TAAAACCAAA 4140

ACTTAATTCT AATGACTTTC CGTTAATTTC TACATTCATA ACTTAAAATC TCCATTCATA 4200

ATTAATTTAA ACAAAATAAA mArGCTTAAC GCCCTATTTT TATACCTCTC TTGGTGCAAC 4260

CGGTGGTGAA TCTACTTTAG GTTGTGGAAT TGCTGTTAAA TCTTCGCCAG TTAATGCATC 4320

TGCTTTTGTA GTGTCGTGGA ATCTGTATcC AGTCGCCTTA AGTTTCTTTG TTACAGCCTC 4380

AGGTAGTGTT GCAAATCCAC GTTGGAAACG ACCATTCACT CCATATTCAT ATTCATATTC 4440

ATCAATACCG TTAGCTTCTG CTTTTAATTC AAATTTATTG TGGAAACCTT GGAAATATTT 4500

CGCTTTAAAT TTAGCGGAAT CCCCATTTTT GCCTGGTATT CTACTTTCAA CTTCCCAAGC 4560

TTCATACAAT ACGCGATCTA CAACTGCATC TTCAATTTCA TCTGCAAAAT CGTCACCATA 4620

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GTCCATTGTA TCCTCTGTAT CTGTATCAGC TTCATGTGAT AAGCCGTATT CAGTTAAAAA 4740

AAGCATTTTA GTAGCATCTA CTTTTTCGCC AGCTTTTCTA AATAAAATAA TACGATCATT 4800

ACTATTTTTC ATATTTGCCA TTCAATATTC CTCCGTTTTT TAAAATGTTT TGTAAGATAT 4860

CGTTACTGAT GTGTGTAGCA ATTCTTGATT GGTAGTATCA TCAACTAACT GTGTGATGTT 4920

AGTATCTTCT TCTTCAAAGT CATAATCGTT TGTTTTAACG CTAGGTGTTA AATCATCAAT 4980

ACATCTTTTA ACAAGTCCGT CATGATGTCC TAAATCATCG CTTACACTCC AAATATCAAT 5040

AACTAAATTC GTATCGCCAG AATAACTATC AAACGTGTAC TTACTTCTAT TTGACTCCGG 5100

CATTTTTATT ACAAAAAAAG GATACGGAAT CTCTTGTTGC ATCTCTTTAC GAGAAATAAC 5160

AGGGAATCCA TATCCTTGTA GCGTTTCATA CGCTTTATTA TAAAGTTGTA AGTTCGGTGT 5220

CATGCTTTTA TCTCCTATTC AAACAACGCT TTCAATTCTT CTACAGTTGA TTTCCTAATC 5280

ACTTCGTATA CCGGCCACAT AAAAGGTTCA GCCTCCATGT ATCGAGTACC AAATTCTAAG 5340

AAACCACTAT AAGCTGCGTG CGATGTGATA GTGTATTGCA AATCGCCAGT TTTTTTATAT 5400

CTGATATTGC GTGATaAATT ACC 5423

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6251 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

AAACGCAGAT GTTCAATTAG AACCAGTCTA TCGTATTAAG GAAGGTATTA AACAAAAGCA 60

AATACGAGAC CAAATTAGAC AAGCGTTAAA TGATGTGACA ATTCATGAAT GGTTAACTGA 120

TGAACTAAGA GAAAAATATA AATTAGAGAC CTTGGACTTT ACTTTGAACA CATTACATCA 180

TCCTAAAAGT AAAGAGGATT TATTACGTGC TCGTAGAACC TATGCATTTA CTGAACTGTT 240

TTTATTCGAA TTACGTATGC AATGGCTAAA TAGATTAGAA AAGTCATCTG ACGAAGCAAT 300

TGAAATTGAT TATGACATAG ACCAAGTTAA ATCATTTATT GATCGTTTAC CTTTTGAACT 360

AACTGAAGCA CAGAAATCCA GTGTTAATGA AATTTTTAGA GATTTAAAAG CACCAATACG 420

TATGCATCGA TTACTTCAAG GTGATGTAGG TTCAGGAAAA ACAGTAGTTG CTGCAATTTG 480

TATGTATGCG TTAAAAACTG CTGGTTATCA ATCAGCATTG ATGGTACCAA CTGAAATTTT 540

AGCAGAGCAA CATGCTGAAA GTTTAATGGC TTTATTTGGA GATTCTATGA ACGTTGCATT 600

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TACGATTGAT TGTTTAATTG GAACCCATGC TTTGATTCAA GATGATGTGA TTTTCCATAA 720

TGTTGGTTTA GTAATTACAG ATGAACAACA TCGATTTGGT GTGAATCAAC GCCAGCTTTT 780

AAGAGAAAAA GGTGCAATGA CGAATGTGTT ATTTATGACA GCAACGCCGA TACCAAGAAC 840

ACTAGCAATA TCAGTTTTTG GTGAGATGGA TGTGTCTTCA ATTAAACAAT TACCAAAAGG 900

TCGTAAACCT ATCATTACTA CTTGGGCAAA GCATGAGCAA TACGATAAAG TTTTGATGCA 960

AATGACCTCA GAGTTGAAAA AAGGTCGTCA AGCATATGTC ATTTGCCCGC TAATAGAAAG 1020

TTCTGAGCAT CTCGAAGATG TTCAAAATGT TGTCGCATTG TACGAGTCTT TACAACAGTA 1080

TTATGGTGTT TCCCGTGTAG GGTTATTGCA TGGTAAGTTA TCTGCCGATG AAAAAGATGA 1140

GGTCATGCAA AAGTTTAGTA ATCATGAGAT AAATGTTTTA GTTTCTACTA CTGTTGTTGA 1200

AGTAGGTGTT AATGTACCGA ATGCAACTTT TATGATGATT TATGATGCGG ATCGCTTTGG 1260

ATTATCAACT TTACATCAGT TACGCGGTCG TGTAGGTAGA AGTGACCAGC AAAGTTACTG 1320

TGTTTTAATT GCATCCCCTA AAACAGAAAC AGGAATTGAA AGAATGACAA TTATGACACA 1380

AACAACGGAT GGATTTGAAT TGAGTGAACG AGACTTAGAA ATGCGTGGTC CTGGAGATTT 1440

CTTTGGTGTT AAACAAAGTG GaTTGCCAGA TTTCTTAGTT GCCAATTTAG TTGAAGATTA 1500

TCGTATGTTA GAAGTTGCTC GTGATGAAGC AGCTGAACTT ATTCAATCTG GCGTATTCTT 1560

TGAAAATACG TATCAACATT TACGTCATTT TGTTGAAGAA AATTTATTAC ATCGTAGTTT 1620

TGACTAATTG CCATGCTGAT TTGTCAATTT GAGTGCAACa CTTCGTTAAT TGAGTGATAT 1680

GACACTTGAA CTATTTAAAT GTAAAGTGGT ATTTTAACAA TTTATAAATT TTCGACTAAA 1740

TAATAGCTAA ATATTACAGT TATTTGTTGA GTCGGTTAAA TAGAAAGTGT TATGATATGT 1800

GAGGAATGTT TAAGACTAGG TACTAAAAAA TGAGGGGTGA GACGTTGAAA CTAAAGAAAG 1860

ATAAACGTAG AGAAGCAATC AGACAACAAA TTGATAGCAA TCCCTTCATC ACAGACCATG 1920

AACTAAGCGA CTTATTTCAA GTGAGTATAC AAACAATTCG TTtAGaTCGC ACTTATTTAA 1980

ACATACCAGA ATTAAGGAAG CGTATTAAAT TAGTTGCTGA AAAGAATTAT GACCAAATAA 2040

GTTCTATTGA AGAACAAGAA TTTATTGGTG ATTTGATTCA AGTCAATCCa AATGTTAAAG 2100

CGCAATCAAT TTTAGATATT ACATCGGATT CTGTTTTTCA TAAAACTGGA ATTGCGCGTG 2160

GTCATGTGCT GTTTGCTCAG GCAAATTCGT TATGTGTTGC GCTAATTAAG CAACCAACAG 2220

TTTTAACTCA TGAGAGTAGC ATTCAATTTA TTGAAAAAGT AAAATTAAAT GATACGGTAA 2280

GAGCAGAAGC ACGAGTTGTA AATCAAACTG CAAAACATTA TTACGTCGAA GTAAAGTCAT 2340

ATGTTAAACA TACATTAGTT TTCAAAGGAA ATTTTAAAAT GTTTTATGAT AAGCGAGGAT 2400

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TTAGAAGCCG TACAAAAGGC TGTTGAAGAC TTTAAAGATC TAGAAATTAT ACTTTTCGGT 2520

GACGAAAAAA AGTATAATCT GAACCATGAA CGAATCGAAT TTAGACATTG TTCTGAAAAG 2580

ATTGAAATGG AAGATGAGCC TGTTAGAGCG ATTAAACGTA AAAAAGATAG CTCAATGGTA 2640

AAAATGGCTG AAGCTGTGAA ATCTGGTGAA GCAGATGGAT GTGTGTCAGC AGGTAATACT 2700

GGTGCTTTAA TGTCAGCTGG TTTATTCATT GTTGGACGTA TTAAAGGTGT AGCTAGACCG 2760

GCTTTAGTAG TAACATTGCC AACGATTGAT GGAAAAGGTT TTGTCTTTTT AGACGTTGGT 2820

GCAAATGCTG ATGCTAAACC TGAACACTTA TTACAGTATG CGCAACTAGG GGATATTTAT 2880

GCTCAAAAAA TTAGAGGTAT TGATAATCCG AAAATCTCAT TATTAAATAT AGGAACCGAG 2940

CCAGCTAAAG GTAATAGTTT AACGAAAAAA TCATATGAGT TATTAAATCA TGATCATTCA 3000

TTGAATTTTG TTGGGAATAT TGAAGCGAAG ACATTAATGG ATGGCGATAC AGATGTTGTA 3060

GTTACCGATG GCTATACTGG GAACATGGTC CTTAAAAATT TAGAAGGTAC TGCAAAATCA 3120

ATCGGTAAAA TGTTAAAAGA TACGATTATG AGTAGTACTA AAAATAAATT AGCAGGTGCA 3180

ATATTGAAGA AAGATTTAGC TGAATTCGCT AAAAAGATGG ATTACTCAGA ATACGGTGGT 3240

TCCGTATTAT TAGGATTGGA AGGTACTGTA GTTAAAGCAC ACGGTAGTTC AAATGCTAAA 3300

GCTTTTTATT CTGCAATTAG ACAAGCGAAA ATCGCAGGAG AACAAAATAT TGTACAAACA 3360

ATGAAAGAGA CTGTAGGTGA AtCAAATGaG TaAAACAGCA ATTATTTTTC CGGGACAAGG 3420

TGCCCAAAAA GTTGGTATGG CGCAAGATTT GTTTAACAAC AATGATCAAG CAACTGAAAT 3480

TTTAACTTCA GCAGCGAACA CATTAGACTT TGATATTTTA GAGACAATGT TTACTGATGA 3540

AGAAGGTAAA TTGGGTGAAA CTGAAAACAC ACAACCAGCT TTaTTGaCGC aTAGTTCGGC 3600

ATTATTAGCA GCGCTAAAAA ATTTGAATCC TGATTTTACT ATGGGGCATA GTTTAGGTGA 3660

ATATTCAAGT TTAGTTGCAG CTGACGTATT ATCATTTGAA GATGCAGTTA AAATTGTTAG 3720

AAAACGTGGT CAATTAATGG CGCAAGCATT TCCTACTGGT GTAGGAAGCA TGGCTGCAGT 3780

ATTGGGATTA GATTTTGATA AAGTCGATGA AATTTGTAAG TCATTATCAT CTGATGACAA 3840

AATAATTGAA CCAGCAAACA TTAATTGCCC AGGTCAAATT GTTGTTTCAG GTCACAAAGC 3900

TTTAATTGAT GAGCTAGTAG AAAAAGGTAA ATCATTAGGT GCAAAACGTG TCATGCCTTT 3960

AGCAGTATCT GGACCATTCC ATTCATCGCT AATGAAAGTG ATTGAAGAAG ATTTTTCAAG 4020

TTACATTAAT CAATTTGAAT GGCGTGATGC TAAGTTTCCT GTAGTTCAAA ATGTAAATGC 4080

GCAAGGTGAA ACTGACAAAG AAGTAATTAA ATCTAATATG GTCAAGCAAT TATATTCACC 4140

AGTACAATTC ATTAACTCAA CAGAATGGCT AATAGACCAA GGTGTTGATC ATTTTATTGA 4200

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AACATCAATT CAAACTTTAG AAGATGTGAA AGGATGGAAT GAAAATGACT AAGAGTGCTT 4320

TAGTAACAGG TGCATCAAGA GGAATTGGAC GTAGTATTGC GTTACAATTA GCAGAAGAAG 4380

GATATAATGT AGCAGTAAAC TATGCAGGCA GCAAAGAGAA AGCTGAAGcA GTAGTCGAAG 4440

AAATCAAAGC TAAAGGTGTT GACAGTTTTG CGATTCAAGC AAATGTTGCC GATGCTGATG 4500

AAGTTAAAGC AATGATTAAA GAAGTAGTTA GCCAATTTGG TTCTTTAGAT GTTTTAGTAA 4560

ATAATGCAGG TATTACTCGC GATAATTTAT TAATGCGTAT GAAAGAACAA GAGTGGGATG 4620

ATGTTATTGA CACAAACTTA AAAGGTGTAT TTAACTGTAT CCAAAAAGCA ACACCACAAA 4680

TGTTAAGACA ACGTAGTGGT GCTATCATCA ATTTATCAAG TGTTGTTGGA GCAGTAGGTA 4740

ATCCGGGACA AGCAAACTAT GTTGCAACAA AAGCAGGTGT TATTGGTTTA ACTAAATCTG 4800

CGGCGCGTGA ATTAGCATCT CGTGGTATCA CTGTAAATGC AGTTGCACCT GGTTTTATTG 4860

TTTCTGATAT GACAGATGCT TTAAGTGATG AGCTTAAAGA ACAAATGTTG ACTCAAATTC 4920

CGTTAGCACG TTTTGGTCAA GACACAGATA TTGCTAATAC AGTAGCGTTC TTAGCATCAG 4980

ACAAAGCAAA ATATATTACA GGTCAAACAA TCCATGTAAA TGGTGGAATG TACATGTAAT 5040

ATATTTGAGC TAAAGCTCAT TGACGCAGTG GTTGACTGGT CATCCAATGG AGAATTGTCT 5100

GACCTAGTCA ACTTTGCGGG GGAAATTCTA AGCAACCTAG ATAAGGTTCC AGAATTTCTC 5160

CCTAAGAAAC ACTAATCAAT aAATTGwTAA GTGTTTCTAA AATTTCTACT TGTTTTTTAG 5220

AATTTAAAAT GGGAAAATAT AGTAGTCTAT GTATAGGCAT TTTTAAAGGA GGTGAATCGA 5280

CGTGGAAAAT TTCGATAAAG TAAAAGATAT CATCGTTGAC CgTTTAGGTG TAGACGCTGA 5340

TAAAGTAACT GAAGATGCAT CTTTCAAAGA TGATTTAGGC GCTGACTCAC TTGATATCGC 5400

TGAATTAGTA ATGGAATTAG AAGACGAGTT TGGTACTGAA ATTCCTGATG AAGAnGCTGA 5460

AAAAATCAAC ACTGTTGGTG ATGCTGTTAA ATTTATTAAC AGTCTTGAAA AATAATAAAT 5520

CTTACATCTG GGTCGTCAGT ATTGTCGACT CAGTTTTTTT CTTTAATTAT CAATAGTTTT 5580

AACGTAAAAT TAAAGATGAT TCAAGAGCAA CACATAAAGG AGATAAAATA ATGTCTAAAC 5640

AAAAGAAAAG TGAGATAGTT AATCGTTTTA GAAAGCGCTT TGATACTAAA ATGACAGAGT 5700

TAGGCTTTAC TTATCAAAAT ATTGATTTAT ACCAACAAGC ATTTTCGCAT TCGAGTTTTA 5760

TTAATGATTT TAATATGAAT CGTTTAGACC ATAATGAGCG TTTAGAGTTT TTGGGTGATG 5820

CGGTATTAGA ATTGACGGTT TCACGATATT TATTTGATAa ACATCCCAAC TTGCCAGAAG 5880

GGAATTTAAC AAAAATGCGT GCCaCTATTG TATGTGAGCC CtCACTkGTA ATATTTGCGA 5940

ATAAAATTGG ATTGAACGAA ATGATTTTAC TTGGTAAAGG TGAAGAGAAA ACAGGGGGAC 6000

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ATCAAGGACT AGATATAGTT TGGAAATTTG CTGAGAAAGT CATTTTCCCA CATGTAGAAC 6120

AAAATGAGTT ATTAGGCGTG GTAGATTTTA AAACACAATT CCAAGAATAT GTGCACCAGC 6180 

AAAATAAAGG TGATGTAACC TATAATTTAA TAAAAGAAGA GGGACCGGCA CATCATCGTC 6240

TATTCACTTC A 6251



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4920 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

ACCTACTGAA GTTGCTAATT TTTTGGAGCA ACTAAGCACT GAAATTGAAC GTCTTAAAGA 60

AGATAAAAAA CAACTTGAAA AAGTAATCGA AGAGAGaGAT ACTAATATTA AGTCTTATCA 120 

AGACGTGgCA TCAATCTGTA AGTGaTGCTT TGATACAAGC TCAAAAAGCT GGTGAAGAAA 180 

CTAAGCAAGC TGCAGAGAAA CAAGCTGAAG CGATTATAGC TAAGGCAGAA GCGCAAgcTA 240

ATcAAATGGT TGGTGACGCG GTAGAAAAAG CACGCCGTTT AGCATTCCAG ACTGAAGATA 300

TGAAACGTCA ATCAAAAGTA TTTAGATCGC GTTTCCGTAT GTTAGTTGAA GCGCAATTAG 360

ACTTATTAAA AAACGAAGAT TGGGATTACT TGTTGAATTA TGATTTAGAC GCTGAACAAG 420

TGACGCTTGA AAATATTCAT CATTTGCATG AAAATGATTT AAAGCCAGAT GAAGTTGCAG 480

CAAATGCACA AAATAATGCA TCAAATACAC CAGACAATAA TCAACAATCC AATGATTCAG 540

AAACAACTAA GAAGTAAGAA TTAAATAAAG ACAGACGCGT AATATACATT TAACTTTTCA 600

CAGCGAATTA GGTAATGGTG AGAGCCTAGT AAAAGCATGT ATGTTATATC ACTGGCTTTT 660

TAATATTTAA ATAATGTAAT GAGAGAACTC TAAGTTGAGT TAATAAGGGT GGTACCGCGA 720

GCAATCGTCC CTTTTAATTT AACTTAGAGT TTTTTAAATT TTTAAGGAGT GAAAAAAATG 780

GATTACAAAG AAACGTTATT AATGCCTAAA ACAGATTTCC CAATGCGAGG TGGTTTACCA 840

AACAAGGAAC CGCAAATTCA AGAAAAATGG GATGCAGAAG ATCAATACCA TAAAGCGTTA 900

GAAAAAAATA AAGGTAACGA AACATTCATT TTACATGATG GCCCACCATA CGCGAATGGT 960

AACTTACATA TGGGACATGC CTTGAACAAA ATTTTAAAAG ACTTTATTGT ACGTTATAAA 1020

ACTATGCAAG GGTTCTATGC ACCATACGTA CCAGGTTGGG ATACACATGG TTTACCAATT 1080

GAACAAGCAT TAACGAAAAA AGGTGTTGAC CGAAAGAAAA TGTCAACAGC TGAATTCCGT 1140

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



TTAGGTGTTC GTGGTGACTT TAATGATCCA TATATTACAT TAAAACCTGA ATACGAAGCT 1260

GCACAAATTC GTATTTTTGG AGAAATGGCA GATAAAGGTT TAATTTATAA AGGTAAAAAG 1320

CCAGTTTATT GGTCTCCTTC AAGTGAGTCT TCATTAGCAG AAGCAGAAAT TGAATATCAC 1380

GATAAACGTT CAGCATCAAT TTACGTTGCA TTTGACGTTA AAGATGACAA AGGTGTCGTT 1440

GATGCAGATG CTAAATTTAT TATCTGGACA ACAACGCCAT GGACAATTCC ATCAAATGTT 1500

GCGATTACCG TTCATCCTGA ATTAAAATAT GGTCAATACA ATGTAAATGG cGAAAAATAT 1560

ATTATTGCAG AAGCCTTGTC TGACGCTGTA GCAGAAGCAC TGGaTTGGGA TAAAGCATCA 1620

ATCAAATTAG AAAAAGAATA CACAGGTAAA GAATTAGAGT ATGTTGTAGC ACAACATCCA 1680

TTCTTAGACA GAGAATCGTT AGTGATTAAT GGTGATCATG TTACTACAGA TGCTGGTACA 1740

GgTTGTGTAC ATACAGCACC AGGTCACGGG GAAGATGACT ATATTGTTGG TCAAAAATAT 1800

GAATTGCCAG TAATTAGTCC AATCGATGAT AAAGGTGTAT TTACTGAAGA AGGCGGCCAA 1860

TTTGAAGGGA TGTTCTATGA TAAAGCTAAT AAAGCCGTTA CTGATTTATT AACAGAAAAA 1920

GGTGCACTAT TAAAATTAGA CTTTATTACA CATAGCTATC CACACGACTG GAGAACAAAA 1980

AAACCTGTAA TCTTCCGTGC TACACCACAA TGGTTTGCCT CAATCAGTAA AGTAAGACAA 2040

GATATTTTAG ATGCAATCGA AAATACAAAC TTCAAAGTAA ATTGGGGTAA AACACGTATT 2100

TACAATATGG TTCGTGACCG TGGCGAATGG GTTATTTCTC GTCAACGTGT GTGGGGTGTA 2160

CCGTTACCAG TATTTTATGC TGAAAATGGC GAAATTATCA TGACGAAAGA AACAGTGAAT 2220

CATGTTGCTG ATTTATTTGC AGAACACGGT TCAAATATTT GGTTTGAAAG AGAAGCGAAA 2280

GACTTACTAC CAGAAGGATT TACACATCCA GGCAGCCCTA ACGGTACATT TACTAAAGAA 2340

ACAGACATTA TGGACGTTTG GTTTGATTCT GGTTCATCAC ACCGTGGCGT GTTGGAAACA 2400

AGACCGGAAT TAAGTTTCCC AGCGGATATG TATTTAGAAG GTAGTGACCA ATATCGTGGT 2460

TGGTTCAACT CTTCTATCAC AACTTCAGTT GCTACAAGAG GAGTATCACC TTATAAATTC 2520

TTACTTTCTC ATGGTTTTGT TATGGACGGT GAAGGTAAGA AAATGAGTAA ATCTTTAGGT 2580

AATGTGATTG TACCTGACCA AGTGGTTAAA CAAAAAGGTG CTGATATTGC GAGACTTTGG 2640

GTAAGTAGTA CGGACTATTT AGCTGATGTT AGAATTTCTG ATGAAATTTT AAAACAAACA 2700

TCTGATGTTT ATCGTAAAAT CAGAAATACA TTAAGATTTA TGTTAGGTAA CATTAACGAT 2760

TTCAATCCTG ACACAGATAG CATTCCTGAA TCAGAGTTAT TAGAAGTGGA TCGTTACTTG 2820

CTAAATCGTT TACGTGAATT TACTGCAAGT ACGATTAACA ACTATGAAAA CTTTGACTAC 2880

TTAAATATTT ATCAAGAAGT TCAAAACTTT ATCAATGTTG AGTTAAGTAA TTTCTATTTG 2940

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CAAACAGTGT TATATCAAAT TTTAGTTGAT ATGACGAAGT TGTTAGCACC AATCTTAGTG 3060

CATACAGCTG AAGAAGTTTG GTCTCATACA CCACATGTTA AAGAAGAAAG TGTTCACTTA 3120 

GCAGACATGC CTAAAGTTGT AGAAGTAGAT CAAGCTTTAT TGGATAAATG GCGTACATTT 3180

ATGAATTTAC GTGATGATGT GAACCGTGCA TTAGAAACTG CTCGTAATGA AAAAGTTATT 3240

GGTAAATCAT TAGAAGCTAA AGTTACGATT GCTAGTAACG ATAAATTTAA TGCATCTGAA 3300

TTCTTAACTT CATTTGATGC ATTACATCAA TTATTTATCG TGTCACAAGT TAAAGTTGTA 3360

GATAAGTTAG ACGATCAGGC AACAGCTTAT GAACATGGTG ATATTGTCAT CGAACATGCA 3420

GATGGTGAAA AATGTGAAAG ATGTTGGAAC TATTCAGAGG ATCTTGGTGC TGTTGATGAA 3480

TTGACGCATC TATGTCCACG ATGCCAACAA GTTGTAAAAT CACTTGTATA ATTGAAATTG 3540

TATAAAGTAC TCATACAGAT GATATAAATT AAAGCTGTCT TCATAATCAT GTTGTAGTTT 3600

TTGTTGACAT GATGAAGAGA GTTTTTTTGT GAATAAAAAA ATGACCAAGT TACCGGTCAT 3660

ATATGTAAAA AATGTGCGAT TTACTAAAAT AAAAATTATT CAGGAATGGT ACAAATTCTC 3720

TGAGGCATAT AAATGCGTTA TAGTTGCTAT TCTCAATTAT GTTCGCGATA ATTTTAAGTA 3780

AAAGTAAGCA CAGATATTGA ATTTGATAGG AGTTAATTGA ATGTATCATA ACAGTAACGC 3840

AAACTTTGTC AATGGTATCA CTTTAAATGT GAGAGATAAG AATGAATTAA AGCCATTTTA 3900

TGAGGACATA TTAGGATTAA ATATTATAAA TGAGACATTA ACATCGATAC AATATGAAGT 3960

AGGTCAAAAT AATCATGTCA TTACACTTGT TGAATTACAA AATGGACGTG AACCTTTAAT 4020

GTCCGAAGCG GGACTGTTTC ATATCGCAAT TAAACTACCT CAAATTAGTG ATTTAGCTAA 4080

TTTACTAATT CATTTAAGCG AATATGATAT TCCAGTTAAC GGAGGTATAC AGCCTGCTTC 4140

GTTATCATTA TTTTTTGAAG ACCCGGAAGG AAACGGTTTT AAATTTTATG TTGATAAAGA 4200

CGAAGCGCAA TGGACGAGGC AAAATAATTT AGTAAAAATT GATATTAGAC CATTAAATGT 4260

ACCGAGATTA GTGAGTCATG CAACAAAATT GTTATGGTTA GGTATTCCAG ATGACGCTAT 4320

TATAGGTGCA TTGCATATTA AGACAATTCA TTTATCAGAG GTAAAAGAGT ACTACCTCGA 4380

TTATTTTGGA TTAGAGCAAT CGGCATATAT GGATGATTAT TCAATATTTT TAGCATCGAA 4440

TGGCTATTAT CAACATTTGG CCATGAATGA TTGGGTATCA GCAACGAAAC GTGTAGAAAA 4500

TTTTGATACG TATGGATTAG CAATTGTTGA CTTTCATTAT CCTGAAACAA CACATTTAAA 4560

TTTACAAGGT CCGGATGGTA TCTATTATCG CTTTAATCAT ATCGAAGTTG AAGATTAGTA 4620

TATACTTTGA ATGGACGAAC CATATAATGA ATCGTTTTTA ATGATCTTTT TATACAAGTT 4680

ATGAAGGAGG CTGGGACATT AAGTTCTTAG GCAATGTAAA AAGCTGATTT CTATTAATTA 4740

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TTTTCCTTAT ATTAATTGCC ATTAATACAA AACCTAGCTC TCGTTTAACT TTATTTATTC 4860

CTCGAACTGA CATTCGnGTG AACTCAAAAT nGCCTACTTn CTTAAATTAC CAATATCTAT 4920

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 626 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TGGATTGCCA TTACATGGAC AAGATTTAAC TGAATCAATT ACACCATATG AAGGTGGTAT 60

CGCTTTTGCA AGTAAACCAT TAATTGATGC TGATTTTATT GGTAAATCTG TATTAAAAGA 120

TCAAAAAGAA AATGGTGCAC CAAGAAGAAC AGTGGGATTA GAATTACTTG AAAAAGGAAT 180

TGCAAGAACT GGTTATGAAG TTATGGATTT AGATGGAAAT ATTATTGGAG AAGTAACTTC 240

AGGAACACAG TCTCCATCAT CAGGAAAATC AATTGCACTT GCAATGATAA AAAGAGATGA 300

GTTTGAAATG GGTAGAGAGT TGCTTGTTCA AGTTCGTAAG CGTCAATTAA AAGCGAAAAT 360

TGTTAAGAAA AATCAAATTG ATAAATAATT AAAAAGGGGT GTGCATTGTG AGTCATCGTT 420

ATATACCTTT AACTGAAAAA GACAAGCAAG AAATGTTACA AACAATTGGT GCAAAATCTA 480

TAGGAGAATT ATTCGGTGAT GTACCAAGTG ACATTTTATT AAATAGAGAT TTAAATATTG 540

CTGAAGGCGA ACGGAGAACA ACGTTACTTA GAAGATTnAA TCGCATTGCA AGCAAGAGTA 600

TCACTAGAGG AACGCGTACA TCGTTT 626

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1126 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

nGGAAGTGGT GTATATATTT GTAATGAGTG TATTGAATTA TGCTCAGAAA TCGTCGAAGA 60

AGAATTAGCT CAAAACACTT CTGAAGCGAT GACAGAATTA CCTACTCCTA AAGAAATTAT 120

GGATCATTTA AACGAATATG TTATTGGTCA AGAAAAAGCT AAAAAATCTT TAGCTGTAGC 180

TGTTTATAAC CACTATAAGC GTATTCAACA ATTAGGACCA AAAGAAGATG ATGTTGAATT 240

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AACCTTA3CC AAGACGTTGA ATGTACCATT TGCAATTGCA GATGCGACAA GTTTAACTGA 360

AGCTGGTTAT GTAGGCGATG ATGTTGAAAA TATCTTGTTG AGATTAATTC AAGCAGCTGA 420

CTTTGACATT GATAAAGCCG AAAAAGGTAT TATTTATGTA GATGAAATTG ATAAAATTGC 480

ACGTAAATCT GAAAACACAT CTATAACACG TGACGTTTCA GGTGAAGGTG TTCAACAAGC 540

A GCTTAAA ^crTAGAAG GTACGACTGC AAGTGTTCCG CCACAAGGTG GACGCAAACA 600

TCCAAACCAA GAAATGATTC AAATTGATAC AACAAATATC TTATTTATTC TTGGTGGTGC 660

CTTTGATGGT ATTGAAGAAG TGATTAAGCG CCGTCTTGGT GAAAAAGTTA TTGGTTTCTC 720

AAGCAATGAA GCTGATAAAT ATGACGAACA AGCATTATTA GCACAAATTC GCCCAGAAGA 780

TTTGCAAGCC TATGGTTTGA TTCCTGAATT TATCGGACGT GTGCCAATTG TAGCTAATTT 840

AGAAACATTA GATGTAACTG CGTTGAAAAA CATCTTAACG CAACCTAAAA ATGCACTTGT 900

GAAACAATAT ACTAAAATGC TGGAATTAGA TGATGTGGAT TTAGAGTTCA CTGAAGAAGC 960

TTTATCAGCA ATTAGTGAAA AAGCAATTGA AAGAAAAACA GGTGCGCGTG GTTTACGTTC 1020

AATCATAGAA GAATCGTTAA TCGATATTAT GTTTGATGTG CCTTCTAACG AAAATGTAAC 1080

GAaGGTAGTT ATTACAGCAC AAACmATTAA TGrAGaACTG AACCAG 1126
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4392 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

ATTGACTTCT TAGCAATnAA TaTGAGTGAA GAACGTACTG TTGAAGTACC AGTTCAATTA 60

GTTGGTGAAG CAGTAGGCGC TAAAGAAGGC GGCGTAGTTG AACAACCATT ATTCAACTTA 120

GAAGTAACTG CTACTCCAGA CAATATTCCA GAAGCAATCG AAGTAGACAT TACTGAATTA 180

AACATTAACG ACAGCTTAAC TGTTGCTGAT GTTAAAGTAA CTGGCGACTT CAAAATCGAA 240

AACGATTCAG CTGAATCAGT AGTAACAGTA GTTGCTCCAA CTGAAGAACC AACTGAAGAA 300

GAAATCGAAG CTATGGAAGG CGAACAACAA ACTGAAGAAC CAGAAGTTGT TGGCGAAAGC 360

AAAGAAGACG AAGAAAAAAC TGAAGAGTAA TTTTAATCTG TTACATTAAA GTTTTTATAC 420

TTTGTTTAAC AAGCACTGTG CTTATTTTAA TATAAGCATG GTGCTTTTTG TGTTATTATA 480

AAGCTTAATT AAACTTTATT ACTTTGTACT AAAGTTTAAT TAATTTTAGT GAGTAAAAGA 540

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CTTACTAAGC TAAAGAATAA TGATAATTGA TGGCAATGGC GGAAAATGGA TGTTGTCATT 660

ATAATAATAA ATGAAACAAT TATGTTGGAG GTAAACACGC ATGAAATGTA TTGTAGGTCT 720

AGGTAATATA GGTAAACGTT TTGAACTTAC AAGACATAAT ATCGGCTTTG AAGTCGTTGA 780

TTATATTTTA GAGAAAAATA ATTTTTCATT AGATAAACAA AAGTTTAAAG GTGCATATAC 840

AATTGAACGA ATGAACGGCG ATAAAGTGTT ATTTATCGAA CCAATGACAA TGATGAATTT 900

GTCAGGTGAA GCaGTTGCAC CGATTATGGA TTATTACAAT GTTAATCCAG AAGATTTAAT 960

TGTCTTATAT GATGATTTAG ATTTAGAACA AGGACAAGTT CGCTTAAGAC AAAAAGGAAG 1020

TGCGGGCGGT CACAATGGTA TGAAATCAAT TATTAAAATG CTTGGTACAG ACCAATTTAA 1080

ACGTATTCGT ATTGGTGTGG GAAGACCAAC GAATGGTATG ACGGTACCTG ATTATGTTTT 1140

ACAACGCTTT TCAAATGATG AAATGGTAAC GATGGAAAAA GTTATCGAAC ACGCAGCACG 1200

CGCAATTGAA AAGTTTGTTG AAACATCACG ATTTGACCAT GTTATGAATG AATTTAATGG 1260

TGAAGTGAAA TAATGACAAT ATTGACAACG CTTATAAAAG AAGATAATCA TTTTCAAGAC 1320

CTTAATCAGG TATTTGGACA AGCAAACACA CTAGTAACTG GTCTTTCCCC GTCAGCTAAA 1380

GTGACGATGA TTGCTGAAAA ATATGCACAA AGTAATCAAC AGTTATTATT AATTACCAAT 1440

AATTTATACC AAGCAGATAA ATTAGAAACA GATTTACTTC AATTTATAGA TGCTGAAGAA 1500

TTGTATAAGT ATCCTGTGCA AGATATTATG ACCGAAGAGT TTTCAACACA AAGCCCTCAA 1560

CTGATGAGTG AACGTATTAG AACTTTAACT GCGTTAGCTC AAGGTAAGAA AGGGTTATTT 1620

ATCGTTCCTT TAAATGGTTT GAAAAAGTGG TTAACTCCTG TTGAAATGTG GCAAAATCAC 1680

CAAATGACAT TGCGTGTTGG TGAGGATATC GATGTGGACC AATTTCTTAA CAAATTAGTT 1740

AATATGGGGT ACAAACGGGA ATCCGTGGTA TCGCATATTG GTGAATTCTC ATTGCGAGGA 1800

GGTATTATCG ATATCTTTCC GCTAATTGGG GAACcAATCA GAATTGAGCT ATTTGATACC 1860

GAAATTGATT CTATTCGGGA TTTTGATGTT GAAACGCAGC GTTCCAAAGA TAATGTTGAA 1920

GAAGTCGATA TCACAACTGC AAGTGATTAT ATCATTACTG AAGAAGTGAT CAGCCATCTT 1980

AAAGAAGAGT TAAAAACTGC ATATGAAAAT ACAAGACCCA AAATAGATAA ATCAGTGCGC 2040

AATGATTTGA AAGAAACGTA TGAAAGCTTT AAATTATTCG AAAGTACATA CTTTGATCAT 2100

CAAATACTAC GTCGCTTAGT AGCGTTTATG TATGAAACAC CTTCGACAAT TATTGAGTAT 2160

TTCCAAAAAG ATGCAATCAT TGCAGTTGAT GAATTTAATC GTATTAAAGA AACTGAAGAA 2220

AGTTTAACAG TAGAGTCTGA TTCGTTTATT AGCAATATTA TTGAAAGTGG TAATGGATTT 2280

ATAGGACAAA GTTTTATAAA ATATGATGAT TTTGAAACAT TGATTGAAGG CTATCCTGTC 2340

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TCATGTAAAC CTGTCCAACA ATTTTATGGG CAATATGACA TTATGCGTTC TGAATTTCAA 2460

CGATATGTTA ATCAAAACTA TCATATCGTG GTTTTGGTCG AAACCGAAAC TAAAGTTGAA 2520

CGTATGCAAG CGATGTTAAG TGAAAtGCAT ATTCCATCAA TAACAAAATT GCATCGCTCA 2580

ATGTCATCGG GGCAAGCAGT GATTATTGAA GGCAGTTTAT CTGAAGGATT TGAACTACCT 2640

GATATGGGAT TAGTTGTCAT TACTGAGCGT GAgcTTTTTA AATCAAAACA GAAAAAGCAA 2700

CGAAAACGTA CGAAAGCTAT CTCAAATGCT GAAAAAATTA AGTCTTACCA AGATTTAAAT 2760

GTGGGAGATT ATATTGTTCA TGTGCATCAT GGTGTTGGTA GATATTTAGG TGTTGAGACG 2820

CTCGAAGTGG GGCAAACGCA TCGTGATTAT ATTAAATTGC AATATAAAGG TACGGATCAA 2880

CTATTTGTTC CAGTAGATCA AATGGATCAA GTTCAAAAAT ATGTAGCTTC GGAAGATAAG 2940

ACGCCAAAAT TAAATAAACT CGGTGGCAGT GAATGGAAAA AAACAAAAGC TAAAGTTCAA 3000

CAAAGTGTTG AAGATATTGC TGAAGAGTTG ATTGATTTAT ATAAAGAAAG AGAAATGGCA 3060

GAAGGTTATC AATATGGGGA AGACACAGCT GAGCAAACAA CATTTGAATT AGATTTTCCA 3120

TATGAACTTA CGCCTGACCA AGCTAAATCT ATCGATGAAA TTAAAGATGA CATGCAAAAA 3180

TCGCGTCCAA TGGATCGCTT GCTATGTGGT GATGTTGGTT ATGGTAAAAC TGAAGTTGCA 3240

GTGAGAGCAG CATTCAAAGC TGTAATGGAA GGAAAGCAGG TTGCATTTTT AGTTCCTACA 3300

ACTATTTTAG CTCAGCAACA TTATGAGACG TTAATTGAGC GTATGCAAGA TTTTCCTGTT 3360

GAAATTCAAT TAATGAGTCG TTTTAGAACG CCTAAAGAGA TAAAACAAAC TAAGGAAGGA 3420

CTTAAAACTG GATTTGTTGA CATAGTTGTT GGTACACACA AATTACTTAG TAAAGATATA 3480

CAGTATAAAG ATTTAGGGCT GTTGATTGTA GATGAAGAAC AACGATTTGG TGTACGCCAT 3540

AAAGAGCGTA TTAAAACATT AAAACATAAT GTAGATGTAC TAACATTGAC TGCAACCCCA 3600

ATAGCTAGAA CATTGCATAT GAGTATGCTA GGTGTGCGGG ATTTGTCAGT GATTGAAACG 3660

CCGCCAGAAA ATCGTTTCCC AGTTCAAACA TATGTATTAG AACAGAACAT GAGTTTTATC 3720

AAAGAAGCTT TAGAAAGAGA ACTATCCCGT GATGGCCAAG TGTTTTATCT TTATAATAAA 3780

GTGCAATCCA TTTATGaAAA ACGAGAACAA CTCCAGATGT TAATGCCAGA TGCTAACATT 3840

GCAGTTGCTC ATGGACAAAT GACAGAGCGC GATTTAGAAG AAACGATGTT AAGTTTTATC 3900

AATAATgAAT ATGATATTTT AGTAACGACG ACGATTATTG AAACAGGTGT CGATGTCCCA 3960

AATGCAAATA CTTTGATCAT TGAAGATGCA GATCGCTTTG GATTGAGTCA GTTGTATCAA 4020

TTAAGAGGTC GTGTTGGTCG TTCAAGTCGT ATTGGTTATG CATACTTCTT ACATCCAGCA 4080

AATAAGGTAC TAACTGAGAC TGCAGAAGAT CGATTACAAG CGATTAAAGA ATTTACGGAG 4140

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TTAGGTAAAC AACAGCACGG CTTTATTGAT ACAGTTGGAT TTGATTTGTA CAGTCAAATG 4260

TTAGAAGAAG CTGTAAATGA AAAACGTGGT ATTAAGGAAC CAGAATCTGA GGTGCCAGAA 4320

GTCGAAGTTG ATTTAAACTT GGATGCATAT TTGCCAACAG AATATATTGC AAATGAACAA 4380

GCTAAAATTG AA 4392



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 729 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

TTTCTTTTGA ATCTATATCG AGGTGGTTGG TAGGTTCATC TAAAATAAGT ACATTGTCAC 60

GTTGCAACAT AAGTAGTGCT AGTTGTAAAC GTGCTTTTTC ACCACCAGAT AAATCATTAA 120

TTATCTTTTT AACATCGTCT TGTACAAATA AGAAACGTCC AAGAACTGCT CGAATATCTT 180

TTTCATTCAT TAACGGATAT TGATCCCACA CATAATCTAA AATCGTTTTA CTAGATTTAA 240

ATTCTGCTTG CTTTTGATCA TAATAACCAA TTTGTAAATT TGCGCCGAAA GTAATATCGC 300

CATTAAGCGC TTTTTGTTGA TTAGCAATAG TTTTAATTAA GGTCGATTTT CCAATACCAT 360

TTGGCCCAAT GATTGCTATA TGATCGCCTT TAGAGACCTC TATACTCATA GGTTTGGTAA 420

TTGCAGTTTG ATAACCGATT TCTAAATTTT TTAGATGGAT GACGTCATTA CCTGTATTCC 480

GGTCAAAGCC AAATTGAATA TTTGCACTTT TGGCATCTAA CATTGGTTTA TCAATGCGTT 540

CCATTTTTTC TAAAATCTTA CGTCTACTTT TTGCCATTCC ACTTGTTGAA GCACGGGTAA 600

TATTTTTCTC AACAAAAGTT TCTAATCGTT TTATTTCTGC TTGTTGACTT TCATATTCTT 660

GCATTCGTTT TTGATAATAT AAATCCCGTT GCTGTATAAA TTCCTCGTAA TTACCAACAT 720

AGCGTTTGA 729
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13856 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

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TGATGTTTCG ATACATTTGT TGCACCTTGT GGATATACTT TAAAGGTTGT GTCGTATGTT 120

TCCTTACTAT CTTTAGCTTC AGATTCCTGT GATTCAACCG TTTTATATTT TTCAAGTGCA 180 

TGTCCTTCAA TATCAACTCG TGGAATAATG CGATTCAACC ATGCTGGTAA ATACCACGAA 24 0 

CCTTTtCCAA ACAATTTCGt TAATGCAGGA ATTAACATCA TtCTGACTAC GAAGGCATCA 3 00 

AAGAGTACAC CAAACGCTAA TGCCATACCC ATTGATTTAA TCATGACATC TTCTTGGAAT 3 60 

10 

ACAAACGCAA AGAAGACACT AAACATAATT AATGCAGCTG CTACAATAAC AGGACCGCTT 4 20 

TCTTTCAATC CTACTTTGAT AGAATAATCA TTATCCCCTG TTTTACTATm yyCTTCATGr 4 80 

15 ATTCGCGACA TAAGGAAGAC TTCATAATCC ATCGCTAATC CAAATAAGAT ACCTATAGTA 54 0 

ATAACCGGTA AAAATGCTAG CATTGGTCCT GTCGTTTCAA TACCAAACAG ACCTTTCATA 60 0 

AAACCATCTT GCATTACTAA TGTTGTAAAT CCTAATGTTG C CATTAATG A CAAGACGAAT 66 0 

20 CCTAAAACTG CTTTTAATGG TATTAGAATT GAACGGAAGA CAATCATTAA TAAGAAAAAT 72 0 

GCTAATACAA CAATGACTGA GGCAAATAAA GGTATCGCCT CATTTAACTT TTTAGACATA 78 0 

TCAATATTAA TGACACTTTG TCCCGAAATC TCCGTTTTGA ACCCATATTT ATCTTGTGCA 84 0 

2S 

TCTTTATGAT AATCTCGTAA ATCATGCACT AAATCATTTG TACTCTCTGC ATTAGGCCCT 9 00 

TGCTTAGGTA TCACGACCAT CAAAGCGTAA TCATTATCTT TACTCATTTG TGGTGGCGTA 960 

ACGATATCTA CATTTTTCTT ATCTTTAATA TCTTTATATA CAGACTGTAA ATCTTGTTGT 10 20 

AATCCTTGTG GATCATCCTT TTTATCTTTC ACATTTATCA ACATCGGTAT TTGGCCATTA 10 80 

AATCCTTCAC CAAATTTATC CGAGATAATA TCGTAAGCTT TTTTCTGTGT AGAATCTGCT 114 0 

GGTTTAACAC CGTCATCTGG AATACCAAGT CGCATATGAC TAACTGGTAT TGCAGCTGCT 1200

ACTAATATGA TTAAACCTAG TAATACTGCC GCAAGTGCAT TTCCTGTAAT AAATTTAGAC 12 6 0 

CATGGCG TAT CAATATCTTT TTTGAATTTA GACTGTAATT TATTCACTTT AATGCGTTtA 13 2 0 

TGGAAAATGC TTATTAATGC AGGTAATAAA GTTAAAGCGC TAAGTACTGC AAAAACAACA 1380

CTAATTGCCG AAGCAAATCC CATTACCGCT AAGAAGTCAA TGCCTACTAA TGATAAACCA 14 4 0 

CATACTGCAA TTACAACTGT TACACCAGCA AAAACAACTG CACTACCTGC TGTTCCTATT 15 00 

GCAAGACCAA TGCCTTTAAT GTAATCTGTT TCAGTTTTCA TAACTTGTCG ATATCTGAAT 15 6 0 

AAAATAAATA ATGCATAATC GATACCAACT GCTAGTCCAA TCATTACGGC TAATGTCAGT 16 2 0 

GTGACATTTG GTATATCGAA TGCATAAGTT AACAAACTGA TAATACCTAC ACCAGAGGCT 1680

AGAC CAATCA ATGCACTTAT AATTGGTAAT CCTGCAGCAA TGACTGAACC GAATGTGATT 174 0 

AACAGTACAA CAAATGCAAC AATAATACCA ACTAGTTCAG AATTACCGCC TACTTCTGTA 18 00 
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w 



AAATGACTTT TAACATTATC TCTAGAGCCA TCTTTTAAAG ATGTTTGACT AACGTCATAT 1920

GTGATATCTG CAAATGCAGT TGTTTTATCT TTACTAATTT GCTTATTTTC ATAAGGATCT 1980

GATATTTTAT CAATGTGCTT GTCATCTTTT TTAATATCAT CTAACGTTTT CTTAATATCT 2040

TTAGTAATGT TCGGTTGCAC AATACCATCA TCTTTAGTCG TCTTAAAGAC AACACGTATT 2100

TGTGCCTTTT CACTATCTTG ATTAAAATGT TTTTCAATCT TTTTATTCGT ATCTAACGAC 2160

TCTAATCCTG TCATTTTAAT ATCATTGTCA AATTTCGGTG CATTTGTAGC AAGTGGTATC 2220

AATATTGCAG CTACAATCAC TATCCATGCA ATGACCGCGG ACCATTTATG TTTTGCGATG 2280

AATGTCCCCA TCTTATATAA AAATTTTGCC AAAGTATATT GCCTCCTTTT AAAATCAACG 2340

TTATAGTTTA AATATACAGT GTAGATTATT GTTCGATTAT AGTATCTATC CCCGACCTCT 2400

TAAAGAATCA ATTGGAAAAT TTTGTATATT AAACTACACA CAAAGGAGAA ATGTAGATGA 2460

AAGAGACTGA TTTACGAGTT ATAAAGACAA AAAAAGCATT GTCGAGTAGC TTGCTACAAT 2520

TGTTAGAACA GCAATTATTC CAAACGATTA CTGTCAATCA AATTTGCGAC AACGCACTCG 2580

TACACCGTAC AACATTTTAT AAACATTTTT ATGATAAATA TGATCTTCTA GAGTACTTGT 2640

TCAATCAATT GACTAAAGAC TACTTTGCTA GAGATATCAG TGACCGTCTT AATCATCCAT 2700

TCCAAACGAT GAGTGATACG ATTAATAATA AAGAGGATTT GAGAGAAATC GCAGAATTCC 2760

AAGAAGAAGA CGCTGAATTT AATAAAGTAT TAAAAAATGT CTGCATTAAA ATTATGCATA 2820

ACGATATCAA AAATAATAGA GACCGTATCG ATATTGACAG CGACATCCCA GATAATCTCA 2880

TATTTTATAT TTATGACTCG TTGATTGAAG GTTTTATACA TTGGATAAAA GATGAAAAAA 2940

TTGATTGGCC TGGCGAAGAT ATTGATAACA TTTTCCATAG ATTAATCAAT ATTAAGATTA 3000

AATAGTAGAT GAGAAACTCA TGAGCGTTAC CAACATTCAT AATAAAAACG ATAGTGkACA 3060

CGTTAATGAA TTCGTGTACT ACTATCGTTT TTTATTTTTA TCGTGCTTAT CGCTATTAAA 3120

ACAACTGATA CACAACACAT AAACTATGAA GAAAAAAATA AATCCGCTAT CTAAATGACT 3180

TTGACTCAGT TGTTTAAATG ACCAAATTGC TAATACAATT CCCATTATTA TTGAAATAAC 3240

GTATCTCACA TTCTTATACC TATAATCCTT TTCTAAAAAT ATGGTTGCTA TTACTTAATT 3300

TTTAAAGTTA TAAATAAAAA GAGCCAACCG CAATGGATGG CCCTTGTTCA TTATGAAGCA 3360

TTAGAACATT TCTGAAACAA CCTTTTGTTC TAAGAAGTGT AATAAGTAGT CTGGACTACC 3420

TGTTTTAGCG TCCGTACCTG ACATTTTGAA ACCACCAAAT GGATGGTATC CAACAACTGC 3480

TGAAGTACAG CCTCTGTTAA GGTATAAATT GCCTACATCA AATTCGTTTA CCGCTTTAAT 3540

CCAATGCTCG CGATTATTTG TAATCACTGC ACCAGTTAAA CCGTAATCTG TATCATTTGC 3600
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TTCTTCTTGC ATGATTCTAT CTTTAGATTT 
AAGTCCTGAA ATGATTGTTG GTTCTACAAA 3720

GTAACCTTTT GAATCATCAG TGCCGCCACC TTGTTCTAAT TTACCTTCTT CTTTACCAAT 3780

CTCAATATAA TTTTTAATCT TATCAAATTG TTTTTTATTA ATAACTGGGC CCATATACGT 3840

ATTGTCTACA GTATTGCCCA ACGTTAATTC TTTTGTTAAT TTGATTGATT TCTCTAATAC 3900

TTCGTCATAA ACGTCTTTAT GCACAATTGC ACGTGAACAT GCTGAACATT TTTGACCAGA 3960

AAAACCAAAT GCTGACGTTA CAATAGCTTC TGCTGCCATA TCTGTATCAA TATTTTCATC 4020

AACTACAATG GCATCTTTAC CACCCATTTC AGCGATAACA CGTTTCAAGA AGTTTTGACC 4080

TTCTTGAACA ACGGCACTAC GTTCATAAAT TCTAGTACCT GTCGCACGTG ATCCTGTAAA 4140

TGTAACGAAA TGCGTATCTT TATGATCAAC TAAGTAATCA CCAATTTCTT TCGGATCACC 4200

AGGAACAAAG TTAACTACGC CTTTTGGTAA TCCTGCTTCT TCTAAAATTT CCATTAATTT 4260

ATAAGCGATA TAAGGTGTAT CCTCAGCAGG TTTCAATAAC ACTGTATTAC CTGCCACAAC 4320

TGGTGCTAAA GTTGTACCAG CCATAATCGC AAACGGGAAG TTCCACGGCG GAATTGTAAC 4380

ACCTGTACCA ATTGATTTAT AGAAATATTT ATTGTGTTCA CCTTCACGAT CAAGTACTGG 4440

CTTACCTTGA GCCAAGTCCA TCATTGAACG TGCATAGTAT TCAATAAAAT CAATACCTTC 4500

AGCTGCATCA CCAACTGCTT CATCCCATGG CTTACCTGCT TCATAAACCA TAATTGCTGC 4560

AATTTCCGCT TTTCGACGAC GAATAATTGC CGAAACACGT AACATAAGCT CTGCACGATC 4620

ATTTGCTGAC CATGTTTTCC AAGATTTATA AGCTTCGTTT GCTGCTTTAA ACGCATCTTC 4680

AACATCTTGT TTTGTTGCCT TTGATGCATT TGCAATCACT TGTGATGTGT CTGCAGGATT 4740

GATTGATTTA ATTTTGTCAT CTTTGAAAAT CTTCTCTCCA TTAATCACTA ATGGTATGTC 4800

TTGACCTAAT TCTTTTTCCA CGTCTTTCAA TGCTTTCTTA AACATATCCA CATTTTCTTG 4860

GACTGAAAAA TCGTAACCAG GTTCATTTTT AAATTCTACT ACCATGTACA CTTACCCCCT 4920

ATAAATTTTG AAAGTGGTTT AACCCTTTGA TTTAATGATA TAACATCATT TAAACTCATT 4980

TTACTATGAT TAAGGTTAGT TTTGCAATCG CTTTCATTTT TATGTTTTAT CACTTATTCT 5040

CAAGTATTTT GAAATTGATT GGTTACTTTT TAAAATTTAT ATGGGTCGCA ACTGCTACTT 5100

TATCGTTTCG TCATTTAATG TTTCGGATGG TAGGTCATTA TCAATTTTAC GAACGACTTT 5160

ACAAGGGTTT CCAACCGCTA AGCTGTGTGG CGGAATATCT TTAGTGACAA CACTACCAGC 5220

ACCAATCACA CTGCCTTCTC CAATCGTCAC CCCTGGTAAC ACGGCTACAT GACCGCCAAA 5280

CCAAGTATTA CTGCCAATAT GAATGGGTCC GGCTTTTTCA AAACCTTCAT TTCTATGATG 5340

GAAATTAAGT GGATGTGTCG CTGTGTAGAA TCCACAATTA GGTCCTATAA AAACATTATC 5400
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TCCTAGTTTA ACGTTCCAAC CATAATCTGT AT C AAAAGG A ATCGAAATAC TTACATTGTC 55 2 0 

TGTTGTTGTT TGAAATAATT GATCAATTAA TTCCTTTCTT TTATTTGTAG CACTCGGTCT 55 8 0 

TGTATGATTT AATTCAAAGC AAA TAT CTTT CGCTCGTGCA CGTTCATTGA TTAAGTATTG 564 0 

ATCAAAGTTT GCATCGTACC ATTTTTCTGC TAACATTTTT TCTTTTTCAG TCATTACACC 57 0 0 

TTTCAACTCC TAATAACTTA TTTACTTGTT TAAAAGTTAA TCAAATAAAC CTTCGCCTAT 57 6 0 

GCAACTAATA CGCTATAACA TTATGAAATC ATGACCTTAT CACCCTTATC TATACAATTC 58 2 0 

TCGCATCAAA TACTGCTAAA GTAGTAGATA AATTCAATAC TACAGACGCA TTCATTTTTT 58 80 

AATCTATTAA CGTACAATGT GAGTAAGAGA AATATAAAGG AGTATGATAG CGATGAGAAT 5940

ATTAATTACA GG CACAGTTG CTATCTTAAT CATTCTAGGT TTGGTCAAAA CGATACAAGA 6 000 

TTACGAAATG ACAAACGACA CGAGTCGTcA GTTGTCAGAC AACAAAGATG ATGATAAAGT 606 0 

CATCCATCTT AATAATTTTA AAAATTTACA TGCGAAAGAA TTTAACCCAT CTGATTTCTT 612 0 

TTAAGTCACC TAAGAATTGC AAATCCAGAA GTCATTTAAG TTTTACCTTT CATTCATACA 618 0 

TCCTTTAATA TTAATTACGA CTTCTTTTAT ATAGATGCTA AGTAGAGAGA TTGTTGTGCA 624 0 

ATGTTTGCAC GGCAATCTCT CTTTTT CTTT TTAAAATTGG TAAAAGTAAA ACGCAACGAT 63 0 0 

TGACTTATAT ACCTATAGGG GGTACATTAG ACGTGTAACA ATGAATCACA GGGAGGCAAT 636 0 

AATGTGGCTA ATACGAAAAA AACAACATTA GATATCACTG GTATGACTTG TGCCGCATGT 6420

TCAAATCGTA TCGAAAAGAA ACTGAATAAA CTTGATGACG TTAATGCCCA AGTGAATTTA 64 8 0 

ACTACAGAGA AAGCAACTGT TGAGTATAAC CCTGATCAAC ATGATGTCCA AGAATTTATT 654 0 

AATACGATTC AACATTTAGG TTACGGTGTC GCTGTAGAAA CTGTCGAATT AGACATTACA 6600

GGTATGACTT GTGCTGCATG CTCAAGCCGT ATTGAAAAAG TGTTAAATAA AATGGACGGC 66 6 0 

GTTCAAAATG CAACGGTCAA TTTAACAACA GAGCAAGCTA AAGTTGACTA TTATCCTGAA 672 0 

GAAACAGATG CTGATAAACT TGTCACTCGC ATTCAAAAAT TAGGTTATGA CGCGTCTATT 6 7 80 

AAAGATAACA ATAAAGATCA AACGTCACGC AAAGCTGAAG CGCTACAACA TAAATTGATT 6 84 0 

AAGCTTATCA TATCAGCAGT ATTATCTTTA CCACTATTAA TGTTAATGTT TGTACATCTT 690 0 

TTCAATATGC AT AT AC CAG C ACTATTTACG AATCCATGGT TCCAATTTAT TTTAGCTACA 6 96 0 

CCTGTACAAT TTATTATTGG ATGGCAATTT TATGTAGGTG CTTATAAAAA CTTAAGAAAT 7 02 0 

GGTGGCGCCA ATATGGATGT ACTTGTTGCT GTTGGTACAA GTGCAGCATA TTTTTACAGT 7080

ATTTATGAAA TGGTTCGTTG GCTAAATGGC TCAACAACGC AACCGCATTT ATACTTTGAA 714 0 

ACAAG CGCCG TACTAATTAC CTTAATCTTA TTCGGTAAGT ATTTAGAAGC TAGAGCGAAG 7200 
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TTAAAAGATG GTAATGAAGT GATGATTCCT CTAAATGAAG TACATGTTGG AGATACACTT 7 320 

ATCGTTAAAC CAGGTGAAAA GATACCTGTT GATGGCAAAA TTATTAAAGG TATGACTGCC 7380 

ATCGACGAAT CTATGTTAAC AGGTGAATCT ATCCCTGTTG AGAAGAATGT TGATGATACT 744 0 

GTAATTGGTT CAACGATGAA CAAAAACGGT ACTATTACTA TGACAGCAAC AAAAGTTGGC 7 500 

GGGGACACTG CGTTGGCAAA TATTATTAAA GTTGTCGAAG AAGCTCAAAG TTCTAAAGCG 7 56 0 

CCGATTCAAC GATTGGCAGA TATTA7TTCT GGTTATTTCG TTCCTATCGT TGTTGGTATC 762 0 

GCACTATTAA CATTTATCGT GTGGATTACT TTAGTTACAC CAGGTACATT TGAACCTGCA 76 80 

CTTGTTGCGA GTATTTCCGT TCTCGTCATT GCTTGTCCAT GCGCATTGGG ACTTGCTACA 7740

CCAACTTCTA TTATGGTAGG TACTGGTCGC GCTGCTGaAA ATGGTATTTT ATTTAAAGGT 7 8 00 

GGCGAGTTTG TTGAACGCAC ACATCAAATT GATACCATCG TTTTAGATAA GACGGGTACC 786 0 

A7TACAAATG GTCGTCCAGT CGTGACAGAT TATCATGGTG ACAATCAAAC GCTACAACTA 7920

CTTGCTACTG CTGAAAAAGA TTCTGAACAC CCATTGGCAG AAGCCATTGT CAATTATGCA 7 980 

AAAGAAAAGC AATTAATATT AACTGAGACA ACAACATTTA AAGCAGTACC TGGCCATGGT 8 04 0 

ATTGAAGCAA CGATTGATCA TCACCATATA TTGGTTGGTA ACCGTAAATT AATGGCTGAC 8100 

AATGATATTA GCTTGCCTAA GCATATTTCT GATGATTTAA CACATTATGA ACGAGATGGT 816 0 

AAAACTGCTA TGCTCATTGC TGTTAATTAT TCATTAACTG GTATCATCGC AG TGGCAGAT 822 0 

ACTGTCAAAG ATCATGCCAA AGATGCTATA AAACAATTGC ATGATATGGG CATTGAAGTT 82 8 0 

GCCATGTTAA CTGGCGATAA TAAAAACACT GCTCAAGCCA TTGCAAAACA AGTAGGCATA 834 0 

GATACTGTTA TTGCAGATAT TTTACCAGAA GAAAAAGCTG CACAAATTGC GAAACTACAG 8400

CAACAAGCTA AGAAGGTTGC GATGGTTGGT GACGGTGTAA ATGATGCACC TGCATTAGTT 84 6 0 

AAAGCTGATA TCGGTATCGC CATTGGTACA GGTACAGAAG TTGCCATTGA AGCAGCTGAT 8520 

ATTACTATTC TTGGTGGCGA CTTGATGCTT ATTCCTAAAG CCATTTATGC AAGTAAAGCA 8580

ACCATTCGTA AT ATTCGT CA AAATCTATTT TGGGCATTCG GCTATAATAT TGCCGGTATC 86 4 0 

CCTATAGCTG CATTGGGCTT ACTTGCGCCA TGGGTTGCTG GTGCTGCAAT GGCACTAAGT 8700

TCAGTAAGTG TTGTCACAAA CGCACTTAGA TTGAAAAAGA TGCGATTAGA ACCACGCCGT 876 0 

AAAGATGCCT AGATTCCTTA ATAATGAAGG ATTCGTTGGT GATTCTGAGA TAGGCTAGTG 882 0 

ATTGGCTCTA TAATGTCGCG GTTTAyaGTt GGATCTTCGC TCCAACTGCA TATATAGTnA 8880

CACTTTTCGC TTGGCGAATT AGTGTATCTT ACCTAATAGc TCCGCCTATT AGGTTCCATC 8 94 0 

ATTATTATAA ATAATAAGTA CACTACGGtT TA'CAG TTGGA TCTTCGCTCC AACTG CAT AA 90 0 0 
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so 



GAAATTTTAA ATGTTGAAGG TATGAGCTGT GGTCACTGCA AAAGTGCTGT TGAATCTGCA 9120 

TTAAATAATA TTGACGG TGT CACTTCAGCT GACGTTAACC TTGAAAATGG TCAAGTAAGT 9180 

GTTCAATATG ATGACAGTAA AGTTGCTGTA TCTCAAATGA AAGACGCAAT TGAAGATCAA 924 0 

GGTTACGATG TCGTTTAATT AGGCAATATT CAACGTCATC AACACCAAAT TAAAAAATCG 93 0 0 

AACTGATGAG AATCCCAACA ATC CAAATT A 7CTCATCAGT TCGATTTTTA ATTTACTCGT 9360 

AACCTAGTAT CTCCAGTCTG CAATACATCT AAT G TTG CAT CTAATGCATC GACAATTAGA 94 2 0 

TTTTTAACTG CAGCTTCAGT ATAAAACGCA ATATGTGGTG TTAATATGAC ATCTTCCCTG 94 8 0 

T CAATCAA CG ATTCTAACAA TGGATCGTTC AGTGTTTTGC CCCTTTGATC ACTTGGGAAA 954 0 

AGTTTGCGTT CAAATT CATA CGTATCAAGT GCTGCACCTT TAATCACACC ATTGTCTAAT 960 0 

GCGTCTAATA ACGCCTTAGT ATCTACTAAA GAACCTCTCG CACAATTGAC AAATACTGCG 966 0 

CCCTTTTTAA AATGTTTAAA TAATTCAG CA TTAAATAGAT AATGATTATA TTTCGTTGCA 9720 

GGTACATGTA ATGTCACGAT ATCAGCACCT TCAACCGCTT CCTCAATCGT ATCTTTGTAA 978 0 

TCGACATACG TTGCAATTTT AGCATTAGGA AACGGt CGTA TGCGACCACA TCACTTTGAT 984 0 

AACCATTGGC AAATATATCG GCTACTACAC GGCCAATTCG ACCTGTACCA ATAACAGCTA 9900 

CTTTTAAATC TTTAATGGAT TTCGATAAAA TAGTAGGTTC CCATCTAAAA TCATGcTCCC 9 96 0 

GCACTTTCGT TTGAATTTGA TTAAAATGAC GAACCACATT AATAGCCTGG TTCACAGCAA 1002 0 

ACTCCGCAAT TGAATTCGGA GAGTATGACG GCACATTTGA CACAATAAAG TTATACTTGT 1008 0 

TTGCTAACTC CAAATCATAT GTATCAAATC CAGCACTACG TTGTGCGATT TGTTTAATAC 1014 0 

CTAGTTCATT TAATCGTTTA TAAACATGCT CTGATAATGG TATTTGTTGT GATAGCGATA 10200 

AGCCATCATA ACCAGCGACA CCTTCAACAT TGTCATCAGT TAATGCTTCT TTAGTAATAT 10260 

CTACCTCAAC ATGATGTTTC TCTGCCCACG CCTTGATATA AGGCATATCT TCATCACGTA 103 20 

CACTCATGAT TTTAATTTTT GTCATTTTAA CATCACCCTT AACTTTATTA TT CAT AT AAA 103 8 0 

TATGCTAGTT CTGTTAATCT TATTGCAGCT TCGTCTAATT TCTGGTCATC TAACGCCAAT 10440 

GAAATTCTCA CATAACGATT ACCATTCTCT CCAAATGGTT TCCCTGGAGC AACAAGTATT 10500

GACTTCTCTT GCACTAAAAA TTGCTCAAAT TGCTCGCTGT CATAACCAGG CGGTGTTTCC 10 56 0 

AAC CATA CAT ATATGCCACC TTTAGCATGA ACAAATGGCA AATCAGCTTT TGCAAGCATG 10620 

GCTTCGAATC GGTCACGACG TGTTTTAAAT ACATTGCTTT GTTCTTCTAA AAAATCATCA 106 80 

TAATGATTCA AAG CAT AT AT TGCGGCATCT TGTAATGCAC CAAACATCCC AGCATTTGTG 10 740 

TGCGTTTGGT ACTTTTTCAA AGCTTGAATC ATATCTTTAT TACCAACTGC AAAACCGACT 10 800 
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CCATTTTCCG AAGCAAGTAT ACTAGGATTT TTAGCGTCGA AAGCGAAAGC ACCATAAGCA 10 920 

AAATCATGCA CGATTTTAGT GTCTGTACCT TTAAATTTAG cTATCGCTTC ATCAAAAACT 10 9 80 

TCTTTCGTAG CTGTCGATCC AGTTGGATTA TTTGGATACG TTAAATAAAT GAGTTTTGTT 11040 

TTATCTATTA TTTGTGAATC AACTTTGGAC CAATCTGGCA AATAATG TGG C GGTT CT AAA 1110 0 

TTAAGCGGGA CTGGCTTGCC AT CAG CT AAA AGTACACCTG CTAAATAATC CGTGTAGCCT 11160 

GGATCAGGTA GTAATACATA GTCTCCTGGA TTGATAACAC ATGTTGGTAC TGCCACTAAT 11220 

CCATTTTTTG TACCATATAA AATGCATACT TCATCTTCTT TATCTAACGT CA CAT TATA T 1128 0 

TGTCTTTGAT AAAAATCTAC AATAGCTTGC TTGAACGCTT CTTTACCATG AAAAGCACCA 11340

TATTTTTGAT TTTCAGGAAT AGTTAGTGCT TTTTGAAAAT GATCAATAAT AC CTTGTGG C 114 00 

GTGGGCCCAT CAGGGATTCC AACTGCCATA TTAATTAATG GCAATGGTCC ATGTTCGATT 114 6 0 

TTACGTCCCA TCGTTTTCCC GAAATAACTA TCAGGGATAT TTGCTAATTT GTTAGAGATC 11520

ATCAAATTCC TCCTCTATCA TTAAACATAG CCTGGGCGAC TATCATAATC CTAACAACTT 11580 

GTATCACTCT CATTTAGATG GTTACAATGA CATCGCCATT CACCGTTATG TTCAACAGAA 11640 

CTTATGACAC ACGTTGTATT GAATGAATTT ATTTTCATTT TAGGTAGGTA TAATATTATT 11700 

GTCAATATTA GGAATTTTCA GATTAATATG CACTCAATCG TTATGATTTA ACTGTCATGC 11760 

ATATCCGCAT GCGCAACCAG TTAGATATGC TTATATAAAG TATAACGCCC ATCAAGGTAC 11820 

GTATTCAAAC GTGAACCTTA ACAGGCGTCA TTCATTGTTA AATAAAACTT CTTAAGCACA 11880 

TACTTATTTC ACTATGCCTT TTACGTTCCC CTTATACTTT TCTCACATCT TTCTCTTAGA 11940 

CTACTCCCTT ATACGCCCCG CTCAATATCT TTAATCATTT CATCTACAGT TATTTTCGCA 12000

CTCGTTAAGA CAATAGGAAC GCCTGCACCT GGATGCGTAC TTGCACCTGC AAAATATAAA 12 060 

TCTTTATAAT CTCGCGATAC ATTTTGTGGA CGATAATAAT TACTTTGCGC TAAAGTTGGC 12120 

ATTAAACCGA ATGCCGAACC AAATTTCGCA TGATACGTTT GCTCAAAATC ATTTGGCGTA 12180 

AAGATTGTTT CTGAAACAAT ATGCGATTTT ATATCTTCAA ATACTTCAAT CGTTGCTAAT 12 240 

TTACGATAAA TAATTTCCTT TATTTGTTGC GTCAAAGCTT CATCTGACCA ATCGATTCCG 12 300 

CTACCTGTTT TAAGTTCCGG CGTCGGCATT AG C A CAT AAA TACCAGTTTT GCCTTCTGGC 12360 

GCAAGTGATT TATCAGCGAC CGCTGGTACA TACACATAAA TAGAAGGATC ATATGATAAA 12420 

CGTCCCTCAA ATATTTCTTC AATATTGCCT CTAAAGTCAT CTGAAAAAAT AACATTATGA 12480 

AGTCTCACTT GATCTGTCAC ATCAATATCT ATACCGATAT ACATTAAAAA TG CTGAACAA 12540 

GAGTAATCTA AGTCTGCAAT TTTATGTGGT GGATACTTTT TAATAGGTGC AAAATCTGGC 126 00 
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ATGTCACCAT TCACTTTTAT CGCATCGGCC CGTTTGAATT TAGGATCAAT AATAATTTGC 12 720 

TCAATTTCAG CATTTAGTTC AATATTAACG CCTAAGTCTT TATTTAATTG CGCTAGcCCT 12 7 80 

TGAGCCATGC CAT A CAT AC C GCCTTTAATA AAA TGCACAC CAAACATCAT TTCAATCATA 12840 

GGAATAATTG AATATAGTGA CGGGCCTCGT TTTGGATCAA TTCCTATGTA TAACGTTTGA 12 900 

AACGCTAAAA GCTTTTGTAT CTTTTCGTTA TCAATATAAT GTTCAATTAG CTGATCTGCA 12 960 

TGATTTAACG TTTTTAACTT AGCACCTTGC ACAAGTGACG T CAT ATT AT A AAAGTCACTC 13020 

GGTTTGCGAT ACGTTCTTTC TAAGAAATAG CGACGTGCAA TTTCATATTT TTTATAAACA 13080 

TCCGTTAAAA AGGACATAAA ACCATGCGTT GAACCAGGTT CTATACTTTC TAGCATTTGC 13140

TGTAATTCAG CTAAATCTGT AGGCACCGTT ATACGATCAT CGTGGTCAAA ATACACATCG 13200 

TAAATATAAC GTAATTGTCT CAATTCAATA TAATCTTCAT AATTTTTACC ACACGCTGTA 13260 

AAAACATCTT TATAAACATC TGGCATCATG ACAATTGTGG GACCCATATC AAATGTAAAG 13320

CCGTCTTTCT TTAATTGATT CATACGCCCG CCTACATTAT TATTTTTTTC AAATATCGTC 13 3 80 

ACTTCATGAC CTTGAGAAGC AATACGGGCT GCCGCTGCTA ATCCTGTGAC ACCTGCACCA 13440 

ATTACTGCAA TCTTCATTAT TCAACCACCT ATATTCTATG ATATTTACTA TTTATTTCAT 13500 

GAAACAACTT TGCCTTTTTC CTCTTATCCA CAAAAACACG TTCATGTAAT GTATAGTTAG 13 560 

CCTGTCTCAC TTCGTCCAGT ATTTCAATAT ATATACGTGC TGCTAATTCT ATGATTGGTT 13620 

GTGCTTCAAT ACTAAATACT TTGATTTGAT CCATAACATC TTGAAAATCT TTTTCTGCGA 13680 

TAGCTGCATA ATATTCCCAT AAGTCAATAT AATGATTATT AACACCATTT TGGTACACTT 13 740 

CAGCAATATC AACTTCATAT TGCTTTAATC GTTGCTTACT AAAATATATC CGTTCATTGT 13800

CAAAATCTTC ACCGACATCT CTTAATATAT TAAnGGGATC CTCTAGAGTC GACCTG 13856
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10088 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

ATATATAAAT ATAGATTAAG TATATAGATT AATCAACTTT TTTGGAAGAG CAAATCACGC 60

AATCAACAAA TAATATAAGA AGTTTTTGCG ATAGTTTTAA AAT AG CTGT A ATAGAATACT 12 0 

AAATGTGACA AACTTAGAAC TAATATCAAG TGTTGATGTT TTGAATATAA AAATGCTAAT 180 
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ATAATTGGTT AATATATGAG TAATTAGAAA 
ATAGACAAAG GATGACGATT TATGTATATC 300

AATATGAAAG ATTATGGGTT AACAGGCATA AACAAAACTA AAGATACTCG AGCAATACAA 360

CGTGCGTTAA ATCGTGGAAG ATGTAAACCA ACGACAGTTT ATATACCGAA AGGGACGTAT 420

GATATTTGCA AACCATTAAC GATATATGGC AATACAACAC TTTTGTTAGA TAATGAAACT 480

ATTTTACGCC GATGTCATTC TGGTCCTTTA TTAAAAAATG GTCGTCGCTT TGGTTTTTaT 540

CGTGGTTATA ATGGACACAG TCATATTCAT ATTAAAGGCG GCAAGTTTGA TATGAATGGT 600

GTATCGTATC CTTATAACAA TACAGCTATG TGCATTGGGC ATGCTGAAGA TATTCAATTA 660

ATAGGTGTGA CCATTAAGAA TGTAGTGAGT GGTCATGCAA TTGATGCTTG TGGGATTAAC 720

GGACTCTATA TTAAAAGCTG TTCATTTGAA GGATTCATAG ACTATAGTGG CGAACcTTTT 780

ATTCTGAAGC AATACAATTA GACATTCAAG TACCTGGTGC TTTTCCAAAA TTCGGAACgA 840

CAGATGGTAC GATAACGAAA AATGTCATTA TCGAAGATTG TTATTTTGGA CCTTCAGAAT 900

TGCCCGAAAT GGGAAGTTGG AATCGTGCTA TTGGCTCACA TGCAAGTAGA CATAATCGAT 960

ACTATGAGAA TATTCATATT AGAAATAATA TATTTGAAGA TATACAAGGT TATGCATTAA 1020

CTCCCTTGaA GTATAAAGAT GCTTTCATTA TTAATAATAA GTTTATTAAC TGTGaGGGTG 1080

GCATTAGATA TTTAGGAGTT AGAGATGGTA AAAATGCAGC AGATGTGaTG ACAGGaAAAG 1140

ACTTAGGTTC CCAAGCAGGC ATAAATATGA ATATAATTGG AAATGAATTT AAAGGATCAA 1200

TGTCTAAAGA TGCGATACAT GTACGTAATT ATAATAATGT TAAACATAAA GATGTATTAA 1260

TCGTTGGGAA TACATTCAAT AATTCGACTC AATCAATTCA TTTAGAAGAT ATTGATACAG 1320

TGTTTTTAAG TCCTGTTGAA GCGGGTATTC AAGTTACTAC AATCAATGTA GATGAAATAA 1380

AAAAGTAAAA AGTTTCGCAT GACATTAGGA TTAAGAATAG TAGATAATTT TTGAAAGCGC 1440

ATTGATAAAA CGGTATAAAT ATGCTATAAT AAACCCAATT ATCTGATAAA AGGGGTATTT 1500

TGACGGTAAT GATAATACAA GATAGACAAC TTTCTATACT CTAATATAGT GAGTTGAAGT 1560

AGCTTGTCAT AATCATCATG AGGGGGAAAT TTATGGCTTA TTTCAATCAA CATCAATCAA 1620

TGATATCGAA AAGGTATTTA ACATTCTTTT CAAAATCAAA GAAAAAGAAA CCGTTTAGTG 1680

CGGGACAACT TATTGGACTA ATATTAGGTC CATTACTTTT CCTATTAACA TTATTATTCT 1740

TTCATCCACA AGACTTACCT TGGAAAGGCG TCTATGTTTT AGCGATTACT TTATGGATTG 1800

CGACTTGGTG GATTACTGAA GCAATTCCTA TTGCAGCAAC GAGCTTATTA CCAATTGTGT 1860

TATTACGA7T AGGTCATATA CTTACACCAG AACAAGTATC ATCCGAATAT GGCAATGATA 1920

TTATCTTTTT GTTTTTAGGT GGATTTATTT TGGCAATTGC AATGGAAAGA TGGAATTTAC 1980
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TT GG ATT CAT GGTGGCAACA GGATTCTTAT CTATGTTTGT ATCGAACACT GCAGCTGTAA 2100 

TGATTATGAT TCCGATTGGT TTAGCAATTA TTAAGGAAGC ACATGATTTA CAAGAAGCCA 2160 

ATACGAATCA AACAAGTA7T CAAAAGTTTG AAAAATCTCT AGTTTTAGCA ATTGGCTATG 2220

CAGGTACGAT TGGTGGCTTG GGTACATTAA TCGGAACCCC GCCATTAATT ATTTTAAAAG 22 80 

GACAATACAT GCAACATTTT GGACATGAAA TTAGTTTTGC TAAATGGATG ATTGTAGGGA 2 34 0 

TTCCAACGGT CATTGTTTTG TTAGGTATTA CTTGGCTCTA TTTAAGATAT GTTGCGTTTA 24 00 

GACATGATTT GAAATATTTa CCTGOTGGTC AGACGTTAAT TAAACAAAAG TT AG A CG AG C 2460 

TTGGCAAAAT GAAGTATGAA GAAAAGGTAG TACAAACTAT CTTTGTACTT GCTAGCTTAT 252 0 

TATGGATTAC AAGAGAGTTT CTTCTGAAAA AATGGGAAGT TACGTCATCT GTTGCAGATG 2 58 0 

GTACGATTGC TATTTTTATA TCAATATTAT TATTTATTAT TCCAGCTAAA AATACTGAAA 2 64 0 

AACATCGCCG TATCATTGAC TGGGAAGTTG CAAAAGAGCT CCCTTGGGGT GTATTAATTT 2700

TATTTGGTGG CGGTTTAGCA TTAGCGAAAG GTATTT CTGA AAG TGGTTTA GCAAAATGGT 276 0 

TAGGCGAACA GTTGAAATCA TTAAATGGTG TTAGTC CGAT TCTTATTGTA ATTGTCATAA 282 0 

CAATCTTTGT CTTATTTTTA ACTGAAGTGA CATCTAATAC TGCAACTGCA ACGATGATTT 2880 

TACCGATTTT AGCAACGTTG TCTGTTGCTG TTGGAGTGCA TC CATT ACT A CTTATGGCAC 2 94 0 

CTGCAGCTAT GGCGGCTAAC TGTGCATACA TGTTACCAGT AGGGACACCA CCGAATGCAA 3 0 00 

TT AT CTTTGG TTCTGGTAAA ATATCTATCA AACAAATGGC ATCAGTAGGA TTCTGGGTAA 3 06 0 

ACTTAATCAG TGCAATAATT ATTATTTTAG TCGTGTATTA TGTAATGCCT ATAGTTTTAG 3120 

GTATTGATAT AAATCAACCA CTGCCATTGA AATAGTAATT GCAGATTAGA ACGAAAAATA 3180

AAAGGTTACA TTAGCAATTG CTTGGACGAG TGGTAACGAA ACGTATACCG CAGCATCGTG 324 0 

TAA5AACAAT ACAAACAAAA GAAAGT CAAC CAAGGATGGA TTCCTATTTT AATCCTTGGT 3 3 00 

TGACTCTTTA TTTTATTTAA ATTGTAGAAC CTAGAAAATA AAGTTTAATT AAAAGCACCA 3360

ATCATTTCTA CTTTGAAATC TAAGGTTTCT AAAATAGCAA TGACTTTCTT TATATCGGTT 34 2 0 

GTAATTGCAG AATCAGCCTG AACGAAAAAT CGATACATAC CTAATTGTGT TTTTAAAGGA 3480

CGAGACTCAA TCCAGGATAA ATTAATATTA AACAAAGCAA ATGTATTAAG CACACTTGCT 354 0 

AACAACCCAG GTTTATCATG CATTGGTGTA ATTAAAAACA TCAATGATGT CGCATTTTGA 3 600 

TCAAATTGCT GCTGATTTTT TATAACTAAA AAACGTGTCA CGTTATGTGG ATAGTCTTCA 3660

ATATGTGTAT CAATAGGTGT AAAACCATAA GctTCGCCAC TACCTAAAGG TGCAATTGCT 372 0 

GCAACGCCAT TTTCAATTTT AGTCAAACTT TGAATTGTAC TGTCGACATA ATCATAGTCA 378 0 
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TTTTTAATAT CAGAAATGGA ATCTGTTCCA TT AC CAT AT A ATGCAAAGTT AATATCTAAA 3 900 

CGTATTTCAC CGTGTGCAAA GACATCTTGC TGTGCAAGTG CATCTGCCAC AATGTTGATT 3 96 0 

GTTCCTTCTA TAGAATTTTC AATAGGGACA ACACCAATCG ATGTGTCATC ATCTGCAACT 4 02 0 

GCCTTGATGA CTTCAAATAA ATTTGACTTT GGTTGAAAAG TTGCTTCATT TTCAGAAAAA 40 8 0 

TACTGACGAC AAGCCAAATA TGAAAATGTA CCTTTAGGGC CTAAATAATA TAATTGCATA 414 0 

TGCTACACCT CTACTAACTT AATGATGGAA AGGGCACTGG TTAGCATTTG ATTCTTTCTT 4200 

TTTATAGAAA AAGTTTGGAT CTTTTACTGT ATTGTCATAT CCGTGATGAT AATTTGACGT 4 26 0 

CAATGTTGGA GATAATGGCG GTGCTAGCCA AGACCATTTT CCGGTAACTT GACGACCTTG 4320

TTGTGCTTCG TTACGTTCGA ATAGTTCGAA TTGCTTTGCA GCGGTCAAAT GATCGACAAT 43 8 0 

TGATACGCCT TCTTTTTTAA AGGAATGATA CACAGCATAG TTCAATTCAA CAAGTGCTCG 444 0 

ATCTTTATTA AATGAATTAT TTTTAAGTGT ATCAAATTCA AACGCATCTG CAACTTTTTC 4500

TAGTAAATTG TAACGGTAAT CATCAATAAA GTTACGTACG CCAATTTCAG TTACCATATA 456 0 

CCAACCGTTA AAGGGTGCAG TTGGATATAC AATGCCACCG ATTTTTAAGT CCATATTGGA 462 0 

AATGATAGGG ACTGCATACC ATTTTAAGTT CAATTTTCTT AATTTTGGAT AATGATTATG 46 80 

TTCAATAGGT ACTTCTTTAA TTAATGAAGT AGGATATTCG TAAAATTTAA CTGACTCATT 474 0 

AGGTAATTGG TAAATCAGTG GTAACACGTC AAAATTAGTA CCTTTTCCTT TCCAACCTAA 4800

GTGATTTGCT AAGCGTGTAA CTTCTTTTTC AGCAGGATCA CCACAATTGT CATAGCCAGC 4 86 0 

ATAGCGAATT AATTGATTGT TGAAAATTTT AGGTCCATCC TTTGGAGCAT AT AT AG T AAT 4 92 0 

ATACGGCTTT AATTTACCTT CATTTGTAGC CTGTGTAATA TGATAAGTAA TTGATGATAA 4980

GAACGATGCT TCGTCAGTAA CATCTCTTGC ATCAATGACA TTTAACGAAT CCCAAAATAA 504 0 

ACGACCAATG CAACGATTTG AATTACGCCA AGCCATTTTA GCACCATAAA TAAGTTCTTC 510 0 

TTCTGTATGT GTATATGTCC CAGTTTCTTT TATTTCTAGT TCAATGTCAT GTAAACGTTT 516 0 

ATTGATAATT TGCGT1TCAT AATGACACTC TTTATACATG TTTTCTATGA AAGCTTGAGC 522 0 

CTCTTTAAAT AACATTAACA ACACCTCGCT TTATATTATA GTCTACATTA TT AAAAT AC T 528 0 

CTTAAAAATT ATGTATATGT CATTAAATTG TTGGTTGATT TTAATTAAAA GTATGGAAAT 534 0 

TAAGGGGCTC TTATGTATAT AAAAAAATGA ATTATGATAA AATGTAAGAA AATATTTAGG 54 00 

TCGATTGGAG AGATACAAGT GTACCAATTA GAAGACGACA GTTTAATGTT ACATAATGAC 5460

TTATATCAAA TAAATATGGC TGAAAGTTAT TGGAATGATA ATATTCATGA AAAAATGGCT 5 520 

GTATTTGATT TGTATTTTAG AAAAATGCCA TTTAATAGTG GCTATGCTGT TTTTAATGGT 5580
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TTAAAGTCTA TTGGCTACAA GGATGATTTC TT AT CAT ATT TAAAAGATTT AAAATTCACA 5700 

GGCAGCATCC GTTCGATGCA AGAAGGCGAA TTATGCTTTG GTAACGAACC ATTGTTACGC 5760 

GTAGAAGCAC CATTGATTCA AGCGCAATTA ATAGAAACAA TTTTATTAAA CATTGTAAAT 5 82 0 

TTCCATACAT TAATTACAAC AAAGGCTAGC AGAATTCGTC AAATTGCATC AAATGATAAA 5880 

TTAATGGAGT TTGGTACACG TCGTGCGCAA GAAATTGATG CAGCATTGTG GGGCGCTAGA 5940

GCTGCTTACA TCGGGGGCTT TGATTCTACA AGTAATGTTA GGGCGGGGAA ATTATTTGGT 6 000 

ATACCTGTGT CTGGTACACA TGCACATGCA TTTGTCCAAA CTTATGGAGA CGAATATGTT 606 0 

GCCTTCAAAA AATATGCTGA AAGACATAAA AATTGTGTGT TCCTAGTAGA TACATTCCAT 6120

ACTTTAAAAT CTGGCGTGCC AAATGCAATA AAAGTTGCAA AAGAATTAGG TGACAAAATT 6180 

AACTTTGTAG GTATTCGATT AGATTCTGGA GATATCGCTT ATTTATCTAA AGAGGCAAGA 624 0 

CGTATGCTTG ATGAAGCAGG ATTTACTGAA ACTAAAATTA TCGCGTCTAA TGATTTGGAT 6 3 00 

GAAGAAACGA TTACGAGTTT GAAAGCACAA GGTGCAAAAG TAGATTCTTG GGGCGTTGGT 63 6 0 

ACAAAGCTGA TTACAGGATA CGATCAACCA GCATTAGGTG CAGTATATAA ACTTGTAGCT 64 2 0 

ATTGAAAATG AAGATGGTTC ATATAGTGAT CGTATTAAAT TATCAAATAA CGCTGAAAAG 64 80 

GTTACGACGC CAGGTAAGAA AAATGTATAT CGCATTATAA ACAAGAAAAC AGGTAAGGCA 654 0 

GAAGGCGATT ATATTACTTT GGAAAATGAA AATCCATACG ATGAACAACC TTTAAAATTA 6600

TTCCATCCAG TGCATACTTA TAAAATGAAA TTTATAAAAT CTTTCGAAGC CATTGATTTG 6 660 

CATCATAATA TTTATGAAAA TGGTAAATTA GTATATCAAA TGCCAACAGA AGATGAATCA 6 72 0 

CGTGAATATT TAGCACTAGG ATTACAATCT ATTTGGGATG AAAATAAGCG TTTCCTGAAT 6780

CCACAAGAAT ATCCAGTCGA TTTAAGCAAG GCATGTTGGG ATAATAAACA TAAACGTATT 684 0 

TTTGAAGTTG CGGAACACGT TAAGGAGATG GAAGAAGATA ATGAGTAAAT TACAAGACGT 6 900 

TATTGTACAA GAAATGAAAG TGAAAAAGCG TATCGATAGT GCTGAAGAAA TTATGGAATT 6 96 0 

AAAGCAATTT ATAAAAAATT ATGTACAATC ACATTCATTT ATAAAATCTT TAGTGTTAGG 702 0 

TATTTCAGGA GGACAGGATT CTACATTAGT TGGAAAACTA GTACAAATGT CTGTTAACGA 7080

ATTACGTGAA GAAGGCATTG ATTGTACGTT TATTGCAGTT AAATTACCTT ATGGAGTTCA 714 0 

AAAAGATGCT GATGAAGTTG AGCAAGCTTT GCGATTCATT GAACCAGATG AAATAGTAAC 72 0 0 

AGTCAATATT AAGCCTGCAG TTGATCAAAG TGTGCAATCA TTAAAAGAAG CCGGTATTGT 7260

TCTTACAGAT TTCCAAAAAG GAAATGAAAA AGCGCGTGAA CGTATGAAAG TACAATTTTC 7320 

AATTGCTTCA AACCGACAAG GTATTGTAGT AGGAACAGAT CATTCAGCTG AAAATATAAC 73 8 0 
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TAAACGACAA GGTCGTGAAT TATTAGCGTA TCTTGGTGCG CCAAAGGAAT T AT AT G AAAA 7 500 

AACGCCAACT GCTGATTTAG AAGATGATAA ACCACAGCTT CCAGATGAAG ATGCATTAGG 7 560 

TGTAACTTAT GAGGCGATTG ATAATTATTT AGAAGGTAAG CCAGTTACGC CAGAAGAACA 7620 

AAAAGTAATT GAAAATCATT ATATACGAAA TGCACACAAA CGTGAACTTG CATATACAAG 76 80 

ATACACGTGG CCAAAATCCT AATTTAATTT TTTCTTCTAA CGTGTGACTT AAATTAAATA 774 0 

TGAGTTAGAA TTAATAACAT TAAACCACAT TCAGCTAGAC TACTTCAGTG TATAAATTGA 7 800 

AAGTGTATGA ACTAAAGTAA GTATGTTCAT TTGAGAATAA ATTTTTATTT ATGACAAATT 786 0 

CGCTATTTAT TTATGAGAGT TTTCGTACTA TATTATATTA ATATGCATTC ATTAAGGTTA 7920

GGTTGAAGCA GTTTGGTATT TAAAG TGTAA TTGAAAGAGA GTGGGGCGCC TTATGTCATT 798 0 

CGTAACAGAA AATCCATGGT TAATGGTACT AACTATATTT ATCATTAACG TTTGTTATGT 804 0 

AACGTTTTTA ACGATGCGAA CAATTTTAAC GTTGAAAGGT TATCGTTATA TTGCTGCATC 8100

AGTTAGTTTT TTAGAAGTAT TAGTTTATAT CGTTGGTTTA GGTTTGGTTA TGTCTAATTT 8160 

AGACCATATT CAAAATATTA TTGCCTACGC ATTTGGTTTT TCAATAGGTA TCATTGTTGG 8220 

TATGAAAATA GAAGAAAAAC TGGCATTAGG TTATACAGTT GTAAATGTAA CTTCAGCAGA 82 8 0 

ATATGAGTTA GATTTAC CGA ATGAACTTCG AAATTTAGGA TATGGCGTTA CGCACTATGC 834 0 

TGCGTTTGGT AGAGATGGTA GTCGTATGGT GATGCAAATT TTAACACCAA GAAAATATGA 84 00 

ACGTAAATTG ATGGATACGA TAAAAAATTT AGATCCGAAA GCATTTATCA TTGCGTATGA 84 60 

ACCTCGAAAC ATACATGGTG GATTCTGGAC TAAAGGCATT CGTCGTAGAA AGCTTAAAGA 8 520 

TTATGAACCA GAAGAACTGG AAaGTGTAGT AGAaCATGAA aTTCmAAGTA AaTGAGAaTG 8580

AAmCAATtGC TGATTGTTTG TCACGAATGA AAtGCAAGGG TATATGCCGG T AAAACG T AT 864 0 

TGAAAAACCC GTGTTTCAAG AGCAAAAAGA TGGCACGGTT GAAGTATCAC ATCAAGAAAT 870 0 

CGTTTTTGTA GGTAAGAAAA TCCAATAACA TAATCCAATT TAAATAAAGA CTATTTGAAG 8760

AGGAAAGGCT ATTCAAAGTT TGAGTAATTT TACTTTGAAT AGCCTATTTG TTTATACATG 8 82 0 

CAAGATGCTC GAT C CAT ATT GTATGAGAAA CCCCCAGCAA GCTATATAAA GCATATGCTG 8 88 0 

GGGG TTCTTA ATATTTTAAA AATTATTGTT AGATTATATA TATCGTCGCT TTTTCTAAAA 8 94 0 

CAATCTCATC GCATGAAATT TTTTCTTCCT AG AGAC CTTT AATAAGATTA ATAGTTTACT 900 0 

TAATCATATC TAGATAGTCT TATGACTTAT GCTTAATGAA AGTCATTCTA GGAGAAGTTC 9060

CCAAAGCTTC TGTGTTCATA ATTGTTAGTA GTATTTTATT ATCATTTGGT ATAAATATTT 9120 

CAATAACAAT TGAGCTATTA TTTTTATTAT ATAATGTGAG TTGTTTGTGT TCTGTATTTA 9180 
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CATTTAAATC TTOAGGATGC CATTCTCCCT CAATAATATT AAGATAATAC TTAGCCTCTG 93 0 0 

AATTACATTT GAATTTATCA ATACTAAATA ATTCAATTTG TTCCATAATA TTATTTACCT 93 6 0 

TTCTAAAATA CAAATTTTAA TAACCATAAA TAGATGAATA CCATCGATAA TGGTCGCCAT 9420

TGGATACTGG AATAACATTG TTTTTAGCAT CTTGAGTCAT AAAACCATTA TCCCATGGAT 94 8 0 

TCCATATAAT TATAACCTCT TGTCCATTAT CTAATTTAGC GTTCCCAACA ACTGCCATGG 9 54 0 

CATGCCCTGC GTGCATACCA TTTCTTGATT CT ACT CTACT ACCTAAAACA GCAATTCCTT 96 0 0 

TATTATTTTT AGTAAGATTG TCAACTTCAT TATATGTAGT CATTCTATTA AGAAGTTGTG 96 6 0 

GACTTCTTCC CTGAGTTTGT CCAAAATAAA TCATCTCTCT TGGCGTTAAA CCAGTAAATT 9720

GGAATCGTTG TCCTTGTAAG TTTGGGTGTA AAAATCTCAT CACAGCTTCT GCATGATATT 9780 

TGTTAGTATT ATAAGTCGCA TTTAGTAATT CAGACATCGT ATAGCCTGCA CACCAACCAT 9 84 0 

TGTTACCTTG AGTTTCTCTT ATCTTGAAAT TCTCAAGTTT ATTTATATAT TGsTCGTTGT 9900

AAGTATAATT ATTACTTTTA AATTGACTAG TTGGCATAGT GACAGAAGCT TTTTGCTTTA 996 0 

GTTGCGTTAC ATTATTGCCA GTAGGTATAC TCTCAGTCTT TnTnAACTnT nTATCTTCTA 10020 

GACGTGGTGT TTTTAGTACT AGTTTAGCTT TATGATTTTG AGTACCACAT AGTAACCTTT 100 8 0 

TGAGTTGT 10088

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7563 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CGGAAACGnA CCCnATGCGT ATGCTTGACG TGCCAAAATT AAATACGAAG TTCATAGCTT 60

TGAGGTACCA GAAGAACATT TATCTGGTCA AGAAGTCGCA GnACTCATAC AAGCAAATGT 120 

TAAAACAGTA TTTAAAACGC TTGTTCTAGA AAATACAAAA CATGAACATT TTGTATTTGT 180 

TATCCCAGTA AGTGAAACTT TAGATATGAA AAAGGCAGCT GCTTTGGTTG GAGAGAAGAA 24 0 

ATTGCAGCTT ATGCCTTTAG ATAATTTGAA AAATGTAACG GG AT AGATTC GTGGTGGGTG 300 

TTCGCCTGTT GGTATGAAAA CATTGTTTCC AACAGTCGTT GACAAATCGT GTGAAAATTA 360 

TAGTCATATC AGTGTGAGTG GTGGGCTTCG AACAATG CAA ATCACAATAG CTGTTGAGGA 420 

TTTGATTACA ATAACTAAAG GCAAAATTGG AGCAGTTATC CATGAATGAT TAATAACAAC 480 
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TGCCACACTC CTTTTTGATT GAATTAGCAT TTTACGATCA TAAACAGTCA TTATAATTGA 6 00 

GTATTTGAAC ATAAAAATGT AATTTT AT CG TAACAATTTG AGTGTTTGTG ATTGTTTTTG 66 0 

GTAATTTATG ATTGAAAAGT GAAAGCGTAC TCATTATAAT ACAAAGTGAG ATGGGGTGAT 72 0 

GATGATAATT ACTGaAAAAA GACACGAGTT AATATTAGAA GAACTTTC GC ACAAAGATTT 78 0 

TTTGACTTTA CAAGAATTAA TAGATCGAAC TGGTTGCAGT GCTTCAACAA TACGArGAGA 84 0 

TTTATCTAAA CTACAACAAT TAGGGAAATT GCAACGTGTG CATGGTGGTG CAATGTTAAA 900 

AGAAAATCGT ATGGTTGAGG CGAATTTAAC TGAAAAATTA GCAACGAATC TTGATGAAAA 96 0 

GAAAATGATT GCTAAAATAG CAGCTAATCA AATCAACGAT AATGAATGCT TATTTATCGA 1020 

TGCTGGTTCA TCTACATTGG AGCTAATTAA ATATATTCAA GCGAAAGATA TCATTGTGGT 10 80 

AACCAATGGT TTAACACATG TAGAAGCTTT ACTTAAAAAA GGTATTAAAA CAATTATGCT 114 0 

AGGTGGTCAA GTTAAAGAAA ATACACTTGC TACGATTGGT TCTAGTGCTA TGGAGATATT 1200

AAGACGATAT TGTTTCGATA AAG CTTTTAT CGGGATGAAT GGATTAGATA TTGAACTTGG 126 0 

ATTAACTACT CCCGATGAGC AAGAGGCATT AGTTAAACAA ACAGCAATGT CATTAGCCAA 132 0 

TCAATCATTT GTACTTATAG AT CATTCTAA GTTTAATAAA GTATATTTTG CTCGTGTACC 13 80 

TTTGCTAGAA AGT A CG ACAA T CAT CA CATC TGAAAAAGCA TTAAAT CAAG AATCGTTAAA 14 40 

AGAATACCAA CAAAAGTATC ACTTTATAGG AGGGACTTTA TGATTTATAC AGTGACTTTC 1500 

AATCCTTCAA TTGACTATGT CATTTTTACG AATGATTTTA AAATTGATGG TTTGAACAGA 156 0 

GCAACAGCAA CATATAAATT CGCTGGGGGG AAAGGTATTA ATGTCTCGCG CGTCTTAAAG 1620 

ACATTGGATG TTGAGTCAAC TGCCTTGGGA TTTGCAGGTG GATTTCCTGG GAAATTCATT 16 80 

ATAGATACAT TAAATAACAG TGCAATTCAA TCGAATTTTA TTGAAGTTGA TGAAGATACA 174 0 

CGTATTAATG TGAAATTAAA AACAGGACAA GAAACAGAAA TCAATGCACC GGGTCCT CAT 18 00 

ATAACGTCAA CACAATTTGA ACAACTGTTA CAACAAATTA AAAATACAAC AAGCGAAGAT 1860

ATAGTTATTG TTGCTGGAAG TGTACCAAGT AGTATTCCAA GCGATGCGTA TGCGCAAATT 192 0 

GCACAAATTA CAGCACAGAC AGGTGCTAAA TTAGTAGTCG ACGCTGAAAA AGAATTGGCT 1980 

GAAAgCGTTT TACCATATCA TCCACTATTT ATTAAAC CT A ATAAAGATGA A TT AG AAG TG 2 04 0 

ATGTTTAATA CAACAGTGAA CTCAGACACA GATGTTATTA AATATGGTCG TTTGTTAGTT 210 0 

GATAAAGG TG CG CAATCTGT TATTGTCTCG CTTGGCGGTG ATGGTGCTAT TTATATTGAT 216 0 

AAAGAAATCA GTATTAAAGC AGTTAATCCA CAAGGGAAAG TGGTTAATAC AGTTGGCTCT 222 0 

GGTGATAGTA CAGTTGCAGG CATGGTGGCT GGAATTGCTT CAGGTTTAAC GATTGAAAAA 2 2 80 
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CGGGACGCTA TAGAAAAAAT AAAATCACAA GTTACGATTA GCGTACTTGA TGGGGAGTGA 24 0 0 

AAATAATGAG AGTAACAGAG TTATTAACAA AAGATACAAT AG CAATGG AT TTAATGGCAA 24 60 

ATGACAAAAA TGGTGTTATT GATGAGTTAG TAAATCAATT AGACAAAGCA GGTAAATTAA 2 52 0 

GTGATGTCGC GTCATTTAAG GAAGCGATTC ACAATCGAGA ATCACAAAGT ACAACTGGTA 2580

TCGGCGAAGG TATTGCCATT CCACATGCCA AAGTGGCCGC AGTTAAGTCA CCAGCTATTG 264 0 

CGTTTGGTAA ATCTAAAGCA GGCGTAGATT ATCAAAGTTT GGATATGCAA CCAGCACACT 27 00 

TATTCTTTAT GATTGcAGcG CCAGAAGGTG GCGCCCAAAC ACATCTAGAT GCTTTAGCTA 2760 

AGTTGTCTGG TATTTTAATG GATGAAAATG TACGTGAGAA ATTATTACAT GCTTCATCAC 2 8 20 

CTGAAGAAGT ACTAGCGATC ATAGATGAGG CTGATGATGA AGTGACAAAA GAAGAAGAGG 2 8 80 

CAGAAGCTGA AGCACAACAA GTTGCAACTG CAGAACAATC ATCTAAACAA TCTAATGAGC 2 94 0 

CATATGTGTT AGCAGTAACT GCTTGTCCAA CAGGTATTGC ACACACATAT ATGGCACGTG 3000

ATGCATTGAA AAAGCAAGCG GATAAAATGG GTATTAAAAT TAAAGTAGAA ACGAATGGTT 3060 

CAAGCGGCAT TAAAAACCAT TTAACTGAAC AAGATATTGA AAATGCAACA GGTATCATTG 312 0 

TTGCTGCTGA TGTTCATGTT GAGACGGATC GCTTCGATGG TAAAAATGTC GTAGAAGTAC 3180

CAGTAGCAGA TGGTATTAAA CGCCCAGAAG AATTAATTAA TAAAGCATTA GATACAAGTC 3 24 0 

GTAAACCTTT TGTTGCCCGT GATGGTCAAA GAAAAGGTAA CTCAAATGAC AGTCAAGAAA 3 3 00 

AATTAAG CCC AGGTAAAGCA TTCTATAAAC ACTTAATGAA CGGTGTTTCT AACATGTTGC 3360 

CACTTGTAAT ATCTGGTGGT ATTTTAATGG CAATTGTATT TTTATTTGGA GCAAATTCAT 34 2 0 

TTAATCCAAA AAGCTCAGAG TACAATGCGT TTGCAGAGCA GCTTTGGAAC ATTGGTAGTA 3480 

AAAGTGCATT CGCGTTAATC ATTCCAATTT TATCTGGATT CATTGCACGT AGTATTGCGC 3 54 0 

ATAAACCTGG TTTCGCTTCA GGTCTTGTAG GTGGTATGTT AGCAATTTCA GGTGGTTCAG 3 6 00 

GATTTATTGG TGGTATTATT GCAGGTTTCT TAGCAGGTTA CTTAACACAA GGTGTTAAAG 3660

CCATGACACG TAAGTTACCA CAAGCATTAG AGGGATTAAA GCCAACATTA ATTTATCCAC 372 0 

TATTAACAGT GACGGCTACA GGCTTATTGA TGATTTATGC CTTTAATCCA CCAGCATCTT 37 8 0 

GGTTAAATCA TTTGTTATTA GATGGATTAA ACAATTTATC AGGTTCTAAT ATTGTATTAT 3840

TAGGTTTAGT TATTGGCGCT ATGATGGCGA TTGATATGGG CGGTCCATTC AACAAAGCGG 3 90 0 

CATATGTTTT TGCAACAGGT GCGTTGATTG AAGGTAATGC AGCACCAATT ACAGCTGCAA 3 96 0 

TGATTGGTGG TATGATTCCA CCGTTAGCAA TTGCGACAGC G ATG TTAATT TTTAGACGTA 4 020 

AATTTACAAA AGAACAACGT GGTTCAATTA TCCCTAACTA TGTGATGGGT ATGTCATTTA 4 080 



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TGATTGGTTC AGGTATAGGT GGCGCAATTG CTTTAGGCTT AGGTTCACGA ATTACTGCGC 4200

CACATGGTGG TATTATTGTA ATTGTTGGTA CTGATGGTGC ACACTTACTT CAAACTCTTA 4260

TTGCACTTCT AGTTGGCACA TTAGTTTCAG CATTAATTTA CGGTTTAATC AAACCAAAGT 4320

TAACTGAAAC AGAAATCGAA GCTTCAAAAT CAATGGACGA GTAGTTTTAA TGATGTAAAA 4380

TGATTGTTAG CAAAGAGCTT CATATTAAGT TGTATGTTCA ATGAATATAT GTTAGTTTTA 4440

TATATCGTGT TAACGGTAGC TTATACAAAG CTGTAAAAAC ACTTTCTATT AATTCAGTTT 4500

TTATGAATTG ATATGAAAGT GTTTTTATTT TTAGATAAAT GAATGAAGAA ATAGACACCA 4560

CAAATGTATA GACTTTTTTA ATATTTTGCA AAAAGTTATG CCAAACGAAG CAGATATAGT 4620

AAAATATGAG TGTCTTAAAG TGAAAATTTA TAAATAAAGA AGGGTTTATA CGTGTCAGAA 4680

TTAATTATAT ATAACGGCAA AGTTTATACT GAAGATGGCA AAATCGATAA TGGTTACATT 4740

CATGTGAAAG ATGGACAGAT TGTTGCAATT GGAGAAGTGG ATGATAAAGC AGCAATTGAT 4800

AATGATACGA CAAATAAAAT TCAAGTGATT GATGCTAAAG GTCATCATGT ATTACCAGGT 4860

TTTATTGATA TACATATTCA TGGTGGTTAT GGTCAAGATG CAATGGATGG GTCATACGAT 4920

GGCTTAAAAT ATCTATCCGA AAATTTGTTG TCTGAAGGGA CGACATCATA CTTGGCCACT 4980

ACAATGACGC AATCGACTGA TAAAATAGAT AATGCACTTA CAAATATTGC TAAATATGAA 5040

GCGGAgCAAG ATGTTCACAA TGCAGCGGAA ATTGTAGGTA TACATTTAGA AGGACCATTT 5100

ATATCTGAAA ATAAAGTTGG TGCTCAACAT CCGCAATACG TTGTACGCCC ATTTATCGAT 5160

AAAATTAAAC ATTTTCAAGA GACTGCTAAC GGATTAATAA AGATTATGAC GTTTGCACCT 5220

GAAATTGAAG GTGCAAAAGA AGCGCTTGAA ACGTATAAAG ATGACATTAT TTTTTCAATT 5280

GGTCATACAG TAGCAACATA CGAAGAAGCA GTTGAAGCTG TTGAGCGAGG AGCTAAACAT 5340

GTCACGCATT TATATAATGC AGCGACGCCA TTCCAACATA GAGAACCAGG TGTTTTTGGA 5400

GCAGCATGGT TGAATGATGC TCTACATACC GAAATGATTG TTGATGGCAC TCATTCTCAT 5460

CCGGCATCGG TTGCAATTGC TTACCGTATG AAAGGTAATG AACGTTTTTA TTTAATTACC 5520

GATGCAATGC GTGCAAAAGG TATGCCTGAA GGAGAATATG ATTTGGGTGG ACAAAAAGTA 5580

ACTGTTCAAT CGCAACAAGC ACGTCTTGCA AATGGTGCGC TTGCTGGTAG TATTTTAAAA 5640

ATGAATCATG GGTTACGTAA CTTAATATCA TTTACAGGTG ATACATTAGA TCATTTATGG 5700

CGAGTAACAA GTTTAAATCA AGCCATTGCA TTAGGTATCG ATGATAGAAA AGGTAGTATT 5760

AAAGTAAATA AGGATGCAGA TCTTGTTATT CTAGATGATG ATATGAATGT AAAATCTACA 5820

ATAAAACAAG GCAAGGTTCA CACATTTAGC TAATAAATAA TCATAATTAA ATGTATGCAA 5880
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TTTTCTGGGG GTGTCTAAAT GGGAAGGCGA TAACATGTAG TTGTAATTTA AGTCATAGTG 6 00 0 

ATAAATTTGA ATGCGTGTTA CCCATGAGTG ACACATATAA CATGGAGGTG AATCCCTAGA 6 06 0 

AATAGGGAAT TAATTGGAAA CTTCGACCAT AATTAGTTTG ATTATATTTA TTCTATTAAT 6120

TGCATTAACC ACTG T ATTTG TTGGTTCAGA ATTTGCATTA GTAAAAATTA GAGCAACAAG 6180 

AATTGAACAG CTAGCAGATG AAGGAAATAA ACCTGCTAAA ATAGTAAAAA AGATGATTGC 624 0 

TAATCTAGAT TATTATCTTT CTGCTTGTCA GTTAGGTATA ACAGTAACAT CTTTAGGGTT 6 30 0 

AGGTTGGCTT GGTGAACCAA CGTTTGAAAA GCTATTACAC CCAAT ATTTG AAGCAATCAA 63 6 0 

TTTACCAACT GCATTAACGA CGACGATTTC GTTTGCAGTG TCATTTATAA TCGTTACGTA 64 20 

TTTGCATGTA GTACTTGGTG AATTAGCGCC TAAATCTATA GCTATTCAAC ATACTGAAAA 64 80 

GCTTGCTTTA GTATATGCAA GACCATTGTT CTATTTCGGT AACATTATGA AACCATTGAT 6 54 0 

TTGGCTGATG AATGGTTCTG CACGTGTTAT TATTAGAATG TTTGGTGTAA ATCCTGATGC 6600

C CAAACTG AT GCAATGTCAG AAGAAGAAAT CAAAATTATT ATTAACAATA GTTATAATGG 666 0 

TGGAGAAATC AAC CAAACTG AATTGG CAT A TATGCAAAAT ATCTTTTCAT TCGATGAAAG 6720 

ACATGCAAAA GATATAATGG TACCTAGAAC TCAAATGATT ACACTAAATG AACCTTTTAA 6780

TGTAGACGAA TTACTAGAAA CAATAAAAGA ACATCAATTT ACGCGTTATC CAATTACTGA 6 84 0 

TGATGGTGAT AAAGACCACA TTAAAGGATT TATTAACGTC AAAGAATTTT TAACTGAATA 6 900 

CGCTTCTGGA AAAACGATTA AAATAGCAAA CTATATaCAT GAGTTG CCAA TGATTTCAGA 696 0 

GACAACACGT ATCAGTGATG CATTAATTAG AATGCAACGT GAACATGTAC ATATGAGTCT 7 020 

TATTATAGAT GAATATGGTG GAACGGCAGG TATTTTAACG ATGGAAGATA TTTTAGAAGA 7080

AATCGTTGGA GAAATTCGTG ATGAATTTGA TGATGATGAA GTGAATGATA TCGTTAAAAT 714 0 

TGATAATAAG ACATTCCAAG TAAATGGCAG AGTACTATTG GATGATTTAA CTGAAGAGTT 7200 

CGGTATAGAA TTTGATGACT CTGAGGATAT TGATACGATA GGTGGATGGT TACAATCTCG 7260

TAATACCAAT TTACAAAAAG ATGATTACGT GGATACAACT TATGATCGCT GGGTTG TTTC 73 2 0 

AGAAATCGAT AACCACCAAA TTATTTGGGT GATATTAAAC TATGAATTTA ATGAAG CG AG 7 3 80 

ACCTACTATC GGACAGTCTG ATGAAGATGA AAAATCAGAA TAGATATTAA TATATAAACC 7440

AACTAAGAAT GATTTAATTC ATTTTTGGTT GGTTATTTTT TTGACTAAAA TTAAnGAAAA 7 500 

GTGAAAATAG TATTGGAACT CAATATCTTT AATGATTTAA TGAATAAnTT TTATTGAAAG 7 560 

CGA 7563 
(2) INFORMATION FOR SEQ ID NO: 34:
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20 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3492 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

TTATATCAAC TTCATGGCGG AACCATTGAT GACCCATTAG ACGAAACAAT AAGCGCATTT 6 0 

sATGAATTGA AACAAGAAGG AATTATACGT GCTTACGGTA TTTCTTCTAT TCGCCCAAAT 12 0 

GTAATTGATT ATTATTTAAA ACATAGTCAA ATCGAAACGA TAATGTCTCA ATTCAATTTG 180 

ATTGATAATC GTCCAGAATC ATTATTAGAT GCAATTCACA ACAATGATGT TAAAGTATTG 24 0 

G CAAGAGGAC CTGTGTCTAA AGGATTATTA ACTTCAAACA GTGTTAATGT GCTCGACAAT 300 

AAATTTAAAG ATGGTATTTT TGATTATTCT CATGATGAAT TGGGTGAAAC AATAGCCTCT 360 

ATTAAAGAAA TTGAAAGTAA TTTATCTGCA TTGACATTTA GTTATTTAAC ATCACATGAC 4 20 

GTGCTTGGTT CCATCATTGT AGGTGCAAGT AGCGTCGACC AATTAAAAGA AAATATTGAA 4 80 

AACTATCATA CTAAAGTTAG TTTAGATCAG ATTAAAACAG CAAG AG CTCG TGTAAAGGAT 54 0 

TTGGAATATA CCAATCATTT AGTGTAGAAG TCATTTTCAG TAATAAAAAC AGCAGCATGA 600 

GGCGTTTCAT TATAAAAATG CCTTACTGCT GTTGTTTATG TACAATTCGC TATAATTTAT 660 

GATTATGATT ACTCACTTAT GATAGAAATT AAAGCGTTGT CCTCACGCAT CAGTATTTAG 720

TAATTTCGCC TTGCGGCATT GCCTTAAGCA AACTTCTGCC ACTTCAT CTC TTAATAATTT 7 80 

TATTAAAACA TCTTTCTATA TTTCACTTCG CATGTTGATT CATCATTATT AGTTATTATT 84 0 

TGTACACCCA GCACATTTCC TTGCAACACA AGTAGTTTGA ATTTTTCACA AGTATAATAT 900

AATGTACCGT CTGAAATTTG GTCTACAGAA AT AT CG CCT A AAATATCCAG CACTGTAAAT 96 0 

TCTTCAAATA CTGATAGTTG TTCCGCATAT CGTACACAAA GTCTTACCAC ACTCTCCGAT 102 0 

TGACAGTTCA TTGCCATCCC ACCTATTTAT GCTTTATTTT TAAATAATTT AGGGAAACAT 108 0 

CGTTCAAAAA ATCTAGGCGC AATTTGATAC ATTTTCAACG CATGaTG CAT CCATTTAGGC 114 0 

CGATTAATTT CCAATTGTTT TGTTTTAATG CCATAAATGA TATCTTCTGC AAGCTGATTA 12 0 0 

GCATCAAGCA TAATTTCCCC CATCTTTTTA gCATACTTCA TTGATGGGTC GGCTTTTTGA 126 0 

TGAAAAGGTG TATCAATCGG GCCAACATTA ACTGTCATGA TATGTAAGTT TGGTGACTCT 13 20 

AGTCTTAAAG CATTCATTAA TGCATAAAAC CCTGCTTTCG ATGCCCCATA ATGTGCAGCA 13 80 

TTTGCTTGTG TGGAAAATGC AGCTTGACTT GAAATACCTA CAATATGTGC GTTAGATGTT 14 4 0 

AAATATGGTC TCAACACAGT ATATAAAACA TTAAAACTAA TTAAATTAAG CTGATACGTT 150 0 
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TAAATGAATC CATCGAAT3A TGTATTGTCT TCAAATI'GCA GTGCCTGTAT CGACTTCAAA 16 20 

TCATTTAAGT CACAAGGAAT AACATTTATA GTTTTCCCCA ATTCCTGTTC AAAGATTCTA 16 80 

GTTGCTTTAT CAACATCACG CACCAACAAC GTTACATGCA CTTTATTTTC TAGTAACTTT 174 0 

CGGACAATCG ATAAACCTAA ACCACTCGTA CCACCAGTCA CTATAAAATG TTGTCCTTTC 1800 

ATCAA7TAAC CTTCCTTTTC AATTATATAG AATGCAATTT AT CAACTTT A CATAATTGAG 186 0 

ACAAGTTGAT TATCTTTCCT AATATATATA CAATAATAAG AAAATATAAC ATACAAATCA 192 0 

AAAACTAAAG GGATGTGaCG TTAATGrAAC TCGTATTTTA TGGAGCTGGT AATATGGCAC 198 0 

AAGCTATATT TACAGGrATT ATTAACTCmA GCAACTTAGA TGCCAATGAT ATATATTTAA 2040

CAAATAAATC TAATGAACAA GCTTTAAAAG CATTCGCTGA AAAACTAGGT GTTAACTATA 210 0 

GTTATGAtGA TGCGACATTA TTAAAAGATG CAGAyTATGT ATTTTTAGGT ACCAAACCAC 216 0 

ATGACTTTGA TGCTCTAGCA ACACGCATCA AACCACATAT TACAAAAGwC AATTGCTTCA 2220

TTTCAATTAT GGCAGGTATT CCGATTGATT AT ATT AAA CA ACAATTAGAA TGCCAAAATC 22 8 0 

CaGTTGCTAG AATTATGCCA AACACAAATG CGCAAGTTGG ACACTCTGTT ACTGGCATTA 234 0 

GTTTTTCAAA CAACTTTGAC CCTAAATCTA AAGATGAAAT TAACGATTTA GTTAAAGCAT 24 0 0 

TTGGTTCTGT AATTGAAGTA TCAGAAGATC ATTTACATCA AGTAACAGCT ATCACCGGAA 24 6 0 

GCGGCCCAGC ATTTTTATAT CATGTATTCG AGCAATATGT TAAAGCTGGT aCsAAACTTG 2 52 0 

30 

GTCTAGAAAA AGAACAAGTT GAAGAATCTA TACGCAACCT TATTATAGGT ACAAGTAAGA 25 80 

TCATTGAACG TTCAGAtTTG AGCATGCCTC AATTAAGAAA AAATATTACC TCTAAAGGTG 2 64 0 

GTACGACACA AGCTGGCCTT GATACATTGT CACAATATGA TTTAGTATCT ATTTTCGAAG 2700

ATTGTCTAAA CGCTGCCGTC GACCGTAGTA TTGAACTTTC TAATATAGAA GAC CAATAAA 2 76 0 

AACAAACCCG CCAACACATG TATGCATCAT CGCAAGCACT GTGTTTGACG GGTTATTTTT 2 82 0 

ATAATTTATT GTTATTTGGC AAGCATTGTT TATTACTTTG TCATTAGATT TTAAAACTAT 2880

CAAAATCTTT TACAAAATTA AAATTAGGTG TATCTTCATT TTGTATCAAT GTTTGATAAA 2 94 0 

TTTCATTTAT ATCTTCTGTA TTATAGCGAT TGCTCAAATG TGTAATCAAC GTACGTTTAA 3 000 

CATTGGCTTC TTTTATCAAT GCAAATACGT CTTCAATATG GCTATGATGA TAATTGTTGG 3060

CTAAATGCTT TTCACCATCT ATATAGGTCG CTTCATGTAC CATCACATCA GCATCTCTAG 312 0 

AAATCACACG TTCATTAGAA CATGGTTTTG TATCACCAAA AATTGCTACA ACTGGACCCT 3180 

GTTTGGACTC ACCTCTAAAA TCTTTTGATT GATAAACTTG ACCATTATGT TCAAATGTAT 3 24 0 

CATGAGATTT TACTTCTTGA TATTTAGGAC CTGGTTCAAG ACCAATGTTT TTTAACGCTT 3 300 
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CATGATTAAG TAAATGCGCC TCTACAGTAA AAC CATC CAT GATGATATGT CAGATGATCA 34 2 0 

TCGATTTCAA TATATGtAAT TGGATAGTTT AAATGTGACT CTGATAAATT CATAGACATT 34 8 0 

TCCACATATG CT 3492

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1973 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
ATCTAGCGGT ACAAGCGTCT TGGAGGCTAG TATGTTGAAC ATTGTAAACC CTGAAGATCA 6 0 

CTTCGTTGTC ATTGTTTCAG GTGCCTTTGG TAACCGATTT AAACAAATTG CACAAACTTA 120

TTACAAAAAT GTGCATATTT ATGACGTAAC ATGGGGAGAA GCTGTAGATG TCAAAGATTT 180

CATCAATTTC CTTTCAACTT TAAATGTTGA AGTTAAAGCA GTATTTAGTC AATATTGCGA 24 0 

AACATCTACG ACAGTGCTAC ACCCTATTCA CGAGTTAGGA AATGCCATTA ATCAATTTAA 300

TAGTAATATT TATTTTGTAG TTGACGGCGT AAGTtGCATT GGTGCTGTTG ATGTTGACAT 360 
TAACAAAGAT AAAATTGATG TACTTGTTTC TGGTAGTCAA AAAGCAATTA TGTTACCTCC 42 0 

AGGATTAGCT TTTGTAGCTT ATAGCCACCG TGCAAAAGAA CATTTCAAAG AAGTAACTAC 4 80 

GCCAAAATTT TAT CT AG ACT TAAATAAATA CATTTCGTCA CAAGCTGACA ATTCTACACC 54 0 

GTTCACACCA AATGTGTCTT TATTTAGAGG TGTAAATGCA TACGTTGAAA CCGTAAAAGC 600 

AGAAGGTTTC AATCACGTAA TAGCACGACA CTATGCAATT AGAAATGCAT TAAGAAGCGC 660
CTTAAAAGCA TTAGATTTAA CTTTATTAGT CAATGATAAA GATGCATCTC CAACGGTTAC 72 0 

AGCATTCAAA CCTAATACAA ATGATGAAGT GAAAATAATC mAAGATGAAC TTAAAAATnG 780

CTTTAAAATA ACAATTGCnG GTGGTCAAGG CCATCTTAAA GG TCAAATTT TnAGAATTGG 840 
TCATATGGGG AAAATTAGTC CTTTCGATAT TTTATCGGTA GTATCTGCTT TAGAAATTAT 90 0 

TTTAACTGAA CACCGTAAAG TTAACTATAT CGGTAAAGGT ATATCAAAAT ATATGGAGGT 960

TATTCATGAA GCAATTTAAT GTACTCGTTG CAGATCCCAT ATCAAAAGAT GGTATCAAAG 102 0 

CATTATTAGA TCACGAACAA TTCAATGTAG ATATTCAAAC TGGCTTGTCC GAAGAAGCAT 1080

TAATCAAAAT TATACCTTCA TACCATGCTT TAATCGTTCG TAGTCAAACT ACGGTTACTG 114 0 

AAAATATCAT AAATGCTGCT GATTCTTTAA AAGTAATCGC ACGCGCCGGT GTTGG TG TAG 12 0 0 
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GTAATACGAT TTCAGCTACT GAACATACAC TGGCAATGTT ATTATCAATG GCACGAAATA 1320 

TTCCGCAAGC ACACCAATCA CTTACAAATA AAGAATGGAA TCGAAATGCA TTTAAAGGTA 13 80 

CTGAGCTTTA TCATAAAACA TTAGGTGTCA TTGGTGCTGG TAGAATTGGT TTAGGTGTTG 144 0 

CTAAACGTGC GCAAAGTTTC GGAATGAAAA TACTAGCTTT TGACCCTTAC TTAACGGATG 1500 

AAAAAGCAAA ATCTTTAAGC ATTACGAAGG CAACAGTTGA TGAGATTGCC CAACATTCTG 156 0 

ATTTCGTTAC ATTACATACA CCACTAACAC CTAAAACAAA AGGCTTAATT AATGCTGTCT 162 0 

TTTTTGCCAA AGCAAAACCT AGTTTGCAAA TAATCAATGT GGCACGTGGT GGTATTATTG 16 80 

ATGAAAAGGC GCTAATAAAA GCATTAGACG AAGGACAAAT TAGTCGGGCA GCTATCGATG 1740

TGTTTGAACA TGAACCTGCA ACTGA CTCGC CTCTTGTTGC ACATGATAAA ATTATTGTTA 1800 

CACCTCATTT GGGTGCTTCA ACAGTCGAAG CTCAAGAAAA AGTGGCAATT TCTGTTTCAA 1860 

ATGAAATCAT CGAAATTTTA ATTGATGGTA CTGTAACGCA TGCAgTGAAT GCACCTAAAA 1920

TGGACTTAAG CAATATAGAT GATACTGTAA AATCATTCAT CAATTTAAGC CAA 1973 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7620 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
GGTGTTTCAG ATGTCACTGG TTGATTTTTA ATTGTAGACG GGTATTTTGG GCTTTCGCCA 60 

TATTTATTTG CCGGCTTACT GTCAAAGCAT AGGAATACTA TCATAACAAT TGTTAGGCCT 120 

AAAT$AACAA AATAAAGAAG TACTAACAAA ATATTAAGAC CCATCGGCAT TAATGTAAAA 18 0 

TCACTGTCAT AATAACTATC GATAATCTGT AATACTATAT AAAATATAAT ACTGAATACT 240

GTCATAATCA TTGGAAATAA CATTGTTCTT GATATATCGT GAAATCTTCG AACGCACAAC 3 00 

GCTAAATTTG GAATAAACGT TGCCAAACTA TAGACAAAAG TATACACAGA TGTAAGGATA 3 60 

ATCATCAATA TACTCATAAC TATTAATGTT TCGTTATCCG CCGCTATAGA AATAAAGAAT 420

AGAAATAGGT TTATTATTAG CACACACACA GCTGGAACCA TAAGTATCAA ATGCCATAGT 4 80 

GCCATATACC AATATTCACT ACGTCTTGAT CTCCCCTTAA AATTTACATA ATTTTTCCAA 54 0 

AATAAAACGA ATGATTTCAT AAAACCTACT TGAGGTAATT GTTCCATTGT AATCTCCCTT 6 00 

TCGTTAATCA TATTTATATT TTTAATTATT GTTACCGTTA TAATTTACAA GATTCATTAT 6 60 

GTAAAATGAA AACCCGCTAC AAGTACACAT CTATATGGAG ACTCATTTGA AAGTCAACGC 780

TTCGTTAACT ATACTAAAAA TATGTCATAC TGCAATGTTC ACGTTTAAAA GAGTCTCAAT 840

CTATGCAAAT AAAATATTCC ATAACAAAGT ATATACTTTA CATTTTTATA ATTCTTAACA 900

ATACTATTTT ATCAAACATT TACCACAATA AAAATATCTT TTTCATTTTT ATTTAAATTA 960

A7CATATAA7 TGCGAGGAGA ATATTATGGA TTTCGTTAAT AATGATACAA GACAAATTGC 1020

TAAAAACTTA TTAGGTGTCA AAGTGATTTA TCAGGATACC ACTCAAACGT ATACAGGCTA 1080

CATCGTGGAA ACGGAAGCTT ACTTAGGTTT GAATGATCGT GCGGCTCATG GCTATGGCGG 1140

TAAAATAACA CCTAAAGTCA CGTCATTATA TAAACGTGGT GGTACAATTT ATGCACATGT 1200

CATGCATACG CATTTACTCA TTAATTTTGT AACAAAATCT GAAGGTATAC CTGAAGGCGT 1260

ACTTATCCGC GCAATTGAAC CAGAAGAAGG TTTATCCGCT ATGTTCCGTA ACAGAGGTAA 1320

GAAAGGCTAC GAGGTAACGA ATGGCCCAGG AAAATGGACT AAGGCATTTA ACATTCCACG 1380

GGCTATCGAT GGCGCTACGT TAAATGACTG TAGATTGTCT ATTGATACTA AGAATCGTAA 1440

ATATCCTAAA GATATTATTG CTAGTCCACG AATCGGTATT CCAAATAAAG GTGATTGGAC 1500

ACATAAATCT TTACGTTACA CAGTGAAAGG TAATCCATTT GTGTCTCGCA TGCGTAAATC 1560

AGATTGTATG TTTCCCGAAG ATACTTGGAA ATAAATGCCA TCTTTCATTG ATTACTATCA 1620

TGAAAATGAA ATCTATCTCC TTATAAGTCA ATCAATCGTG CCGTCAACAT GCGGATGGGT 1680

TGATTGTTTT TCTTTGTATC CATCATATTT TTTGATTCAT CTCCTCTTAT TGAACTTGTT 1740

CTTAATTATA AAATATAACA ATAGAATTAT TTATAATTAT TAAATTTAGA TGCATTAATA 1800

TTATTGATAT TATTTTCAAA AACTAGAAAT ATTGATTTGT TGCATGTATA ATGTTAAAAG 1860

CGCCCTTTTA TAACGCTTAC ATATAAAAGC TTATTTAGGG AGAGGGATAT TCAACAAGGG 1920

GGATTTGAAA ATGATAGAAC TTAATGCAAT TACAACATTA TGTTTAGCTT GTATCCTTTA 1980

TTTACTTGGT AAGGCTATCG TTAATCACGT TAATTTTTTA AAACGTATTT GTATACCAGC 2040

ACCAGTGATT GGCGGCTTAA TCTTTGCTAT TTTAGTTGCG GCTTTGGATT CATTTGGCAT 2100

GGTTAAGATT AAATTAGATG CTTCATTCAT TCAAGATTTC TTCATGTTAG CATTCTTTAC 2160

GACAATCGGT CTTGGTGCAT CATTGAAATT ATTTAAATTA GGTGGCAAAG TCTTGCTATT 2220

ATACTTTA7G TTTTGTGCTA TCATTTCAGT CATTCAAAAC ATAGTTGGTG TATCACTAGC 2280

AAAAGTATTA AATATTAAAC CTTTGTTAGG ATTAACAGCA GGTTCCATGT CTATGGAAGG 2340

CGGTCATGGT AATGCTGCTG CTTATGGTAA GACAATTCAA GATTTAGGTA TTGATTCGGC 2400

ACTGACAGCG GCTCTTGCAG CTGCAACTTT AGGTCTTGTA TTTGGAGGGC TTATCGGTGG 2460
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ATTTAAAGAT TATAGCCAAG TAGCATATAA CGAACA7TTA CATAGTAAAT TTAATGCCAC 2 58C 

TGAAGTATTC TTCATTCAAT TTACAATCGT TGTATTCTGT ATGGCAGTTG GAAGTTATTT 2 64 0 

CAGTCATTTG TTTACAGCTC AAACAGGGAT TAATGTTCCA ATTTACGTTG GCTCATTATT 2 700 

TGTAGCTGTT ATTGTCCGAA ATATCTCTGA AAGTTTTAAT TTTAATATTG TAGATTTAAA 2 76 0 

AATTACTAAT GAAATTGGCG ATGTCGCATT AGGTATTTTC TTATCTCTTG CGCTAATGAG 2820

CATTCAATTA ATCGAAATTT ATAAACTTGC TATACCTCTT ATTATTATCG TTTTAGTTCA 2880

AGTTGTCGTT ATGATTTTAT TTGCTGTTTT AATTTTATTT AGAGGTTTAG GAAAAGATTA 2 94 0 

TGATGCTGCA GTAATGGTAG GTGGTTTTAT CGGTCATGGG CTTGGTGCAc GCCAAATGCC 3000

ATGGCAAATT TAGATGTTAT TACTAAAAAA TATGGAAACT CACCTAAAGC ATATTTAGTT 3 06 0 

GTACCTATTG TTGGTGCATT CTTAATCGAT TTAATTGGTG TTATAGTCAT TATGGGATTC 312 0 

ATACAATGGT TTAGTTAAAC ACCAAACTCA TAAATAAAAG AGGAGGCCTT CGCCTCcTcT 3180

TTTATTTATC CTCGATGTAT ATTCAAGTTA CGTTGTTCTA TCCATGACAA T ATTTCCG G A 3 24 0 

CTAAATACGA TTTGTTTTTG TGTTAAGTCG TCAATATTTT TAGCATCTAA CATCGTCATT 3 3 00 

ATTGATTTCA TGTGTTCAAT AAATGATTCT ACATAAGCTA CTGTATGTGC AATGCCATTA 33 6 0 

TTTTCAACTT GATTTAAAAA CGGACGTGAC ATACCAGTTG CCTTTGCACC AAGT G CT AAA 34 2 0 

CTTTTAATTG CATCGAGTGG TGTACGTAAA CCACCACTCG CGAAAACTGA AATTTCGCTT 34 80 

TGATAAGCCG TTGTTTCAAG TAATGACTCA ACTGTAGACT GTCCCCATGA TGATAAGTAA 3 54 0 

TCCATATCTT TATTTGCACG ACGTTCATTT TCAATATCTA CAAAGTTAGT ACCACCTTTG 360 0 

CCACTAACA? CGACATACTT GACGCCTATT TGTTGTAAGT CATGCATTAA TTCTTTGCTC 3660

ATACCAAATC CAACTTCTTT TATAATGACT GGAACAGACA CTCGTGATAC AATCGACGCT 372 0 

ATATTTATCTA ACCAAGTCAC AAATTCACGA TTCCCTTCAG GCATAACTAA TTCTTGAGGA 3 78 0 

GAATTAACAT GGATTTGTAA CGCTTGTGCC TCAAGTAATT CAACTGCTTC CAAAGCCTTT 3840

TCTACTGGTA CGTCCGCACC AACATTGCTA AAAATCATGC CTTCAGGATT CATTTTTCGC 3 900 

GCAATCGTAA ACGTCTCAGC CATGCGTGGA TTTCTCAATG CCGCATGTGT TGATCCAACT 3 96 0 

GCCATCGCTA AGCCAGTTTC TCTTGCAACT ACAGCTAGCT TTTCATTGAT GTTTTTCGTC 4 02 0 

CACTCGCTAC CACCCGTCAT TGCATTAATA TAAAC CGG AT ATGCCATCGT TAAGTCAGGC 408 0 

GTCTGTGATG TCAAATCGAT A TCATTT ACA TTAATTGATG GGATAGAATG ATGCACAAAA 414 0 

CGCATCTTAT CAAAATCTGA ATGCATTGCG TCAGATTGGG CCATTGCTAT TTCAACATGT 420 0 

TCATTTTTTC TCTGTTCTCT TTGAAAATCA CTCATGATTA AACCTACCTT TTCGTCATTT 426 0 
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ATTACAGCTA AGCAAATATA ATATCCATAA TGTAAATGTA ATGCCGGCAT ATTTACAAAG 4 380 

TT CAT AC CAT AAATCCCAGC TATGAATGTT AACGGTGAAA ATATAACTGA TACTAATGTC 444 0 

AGTACTTGCA TAATACTATT CATTCTAAAT GACGTGTATG ACTCAAAATT TTCTCGTATT 4500

TCGTTTGTCA TTTCTTGAGC AGTACGAATG ATATTACGTT GCTTAATCAA GTGGTCATCG 4 56 0 

ATATGTTGAA TGTATAGCGA ATGT7TATTA TCTATAATCA AATCACCATT TTGTTTCATT 4 62 0 

GTATCAATTA GCTCTTGCAT AGGAAACAGT ACACGTTTTA CTTTAATCAA ATCCGAACGT 4 680 

AACTTAAAGA CACTATCCAT GACCATTTTA TTAAAGCGAT CATCTACATG GCGGTCTTCA 4 74 0 

AAATGATAAA CACTATCTTC AAGTGCATAT ACAAAGTTGA AATATTTATC AACCATCATA 4800

TCTAAAATTA ATATGACGAC ATCTGCACAA TCTAATTCTG CATCTAATGT ATTCATATAC 4860

TTATAGACTA CTTTATTTAA TGATTCCAAC GTTTGATGAT GATATGTTAC TAATACATTG 4 92 0 

TCTTGTATAA AAATATTTAG TGCTATTGGT GAATAGTTTG ACCCCATAAT ACTATGGAAT 4980

ACTAAGTATT GATAATCTTT ATAAGATTTA TATTTAGCTC GTGGCATACC GTTAATTGCA 5 040 

TCATCCACTT CTAAATCATT AAAATTAAAA TGTGCTTTAA ACCATTCATT TTCTTGTTCA 5100 

TTCGGTTCAT CAAAATCATA CCAAACAATA GTCGCATCTT TTGGTATCTC TTTGATATCA 516 0 

TCAACTACTT TAAACGGTTC ATATGTAGTT TGATACCGTA TCTTTAAAGC CATCGATACT 5220 

CCCCCTAAAT AACGAATTCT CTATTATTTT ATCATGAATT AAATAACGTG TATGTCTTAA 52 80 

TTTATTTTAG TATGATAGTC ACTAAGGAGA TGGTTATTAT CAAACAACTT TTTACACATA 5 34 0 

CTCAAACCGT AACATCTGAA TTCATTG AC C ATAACAATCA TATGCATGAT G CAAATT AT A 54 0 0 

ATATCATTTT TAGTGACGTC GTGAATCGTT TTAATTACAG CCACGGTCTT TCTTTAAAAG 5460

AACGCGAAAA TTTAGCATAT ACGCTATTTA CACTAGAAGA ACATACGACA TACCTCTCAG 552 0 

AATTGTCTCT TGGCGATGTA TTTACTGTTA CTTTATATAT TTATGATTAC GATTATAAGC 55 8 0 

GGTTGCATTT ATTTTTAACA TTAACTAAAG AAGATGGTAC ACTAGCATCA ACAAATGAAG 5640

TAATGATGAT GGGAATTAAT CAGCACACAC GTCGTTCTGA TGCTTTTCCT G AAT CATTTT 57 0 0 

CAACACAAAT AGCACACTAT TATAAAAATC AATCAACTAT CACTTGGCCT GAACAATTAG 57 60 

GACATAAAAT AGCAATTCCA CACAAAGGAG CATTAAAATG ACAGATGCAT TACAACAAAA 5 820 

GATT CAT AT C GAATTACTAG ATTTATTAGA TGATGTTAAG TTTGAATTAA CAGAATTAAA 5 8 80 

TGCACAAAAA GGGTTATACA TTAACGGACC AGCAAATCAG CTACTTAAGC GTGGCGTGCA 5 94 0 

TATGGCTTAT GTTCAAGGAC AAAAGCAAGC CATCGATAAT ATTATGACTA TTGTGGAACA 6 000 

ACAGCTTGAA AGATCAACAT TTCCTAGAAC ATTATGATAA ATTT CAAAAT GAGGTTGCTC 6 06 0 
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ATAATTTTTT AGATCAATTT TATCAAATTA AAGGGCAATA CTTTATCATC ACACATATCA 6180 

ATACACTTAT TGGTGATTTT CACTCAGAAG CTCATTAACA ATTAGTCTAT ATAACCCTTG 624 0 

CTATATTTTC AAAAACAAAA CCCAATTACG TTTTCATGTC AAATATCATC TTGCATGAAA 6 3 00 

TCGTAACTGG GTCATTTATA TGTTATTAGT TATTTTGTGT TACATCCTCA TCTATCGATT 6 3 60 

TGGCAATTTG TTTAATAGCT TTATGTGATT GTCTAATTGG ATAAATTGGA AAATCATGTA 64 2 0 

CCATCTTAGG ATAATCATAA AACTCAATGT ATTGATGATG TTGCAACATC ATTTGTTCAA 64 8 0 

AT AG CTTCAT ATCAGGATGT GTCATTTCAC GTCCACCACC AAA CAT AT AA ACTGGTGGCA 6 54 0 

ATCCTTCTAT TGTGCCATTA ATTGGCGATA TGCGCTTATC TGTTAATGGT AGGCCATTCG 6600

CCCATTTTTT CATAATCTCA TTGACACCAA ACTGACTTAG aACCGCATCT TGTTCGATTA 666 0 

AGGCGTCCGA AATATCTTTA TTAGATAGTG TTGCATCTAA AATTGGTGAG ATTAAATACA 6 72 0 

ATTTATTCGG TAATGGCTGT TGATTAxCTA AAAGAGATTG TACAAAGGAT AATGCCAGTG 6780

CACCACCTGA ACCATCACCC ATGACTACGA CATTTTGATG TCCTACTTCA GATACTAATT 6 84 0 

G a T CATAAAC ACGTTGTATC GCTTGGnAAA GTATCGTCaA TATGnAAACT CTGGTGTCTT 6 90 0 

TGGATAGATA GGCAGTACAA CCTCATATAA TGtACTTAAA GTGATTTTAT CCCAACAATC 6 96 0 

TCCAATGGAA CGGTGATGGT TGTAGTGCAT TGAATCCACC GTGAATATAT AAAATTTTCT 7020 

TAT CAATTTG ATGTCTGAAA TTAAAGCGAA AGACTTGCAT ATCATCTAAT GACAATTTTT 70 8 0 

CTAAATTTGC TTTAACATTT AATG TTGAAG GCTGCTTATG TTTTTTTCTA TTTTCAATTT 714 0 

CTCTTTT A T A AAAAAATCTT TCAACATCTT GATCATTTTT AAACATAATC GAGCGATTGT 72 00 

GAAGCAAATA TTTATTGACA ACGCTATTCA TAACACGGTT TCTAATCAAT GTCTTAACCT 7260

ACCTTTATAT ATTTTATGTA TCCAATGATk GTCTATCCCC TACATTCTTT GCCAAAAAAA 73 2 0 

GTATATAATG TAGAAGATAT TTTCTTTTTC ACTTTCAAAT TTAAGACTAC AATTGAACAG 7380

TGATTTTTCA TCATTATAAC AGACAACTAG ACATATTGAT AAGTAAAGAA AAGAACTTTA 7440

TACGGAGGTA C CTTGCATG A CAAAT CCAAA TCAACGATTA GAACCATTTG ATGAGACATT 75 0 0 

TCAACAACCG AAT ATT CATC GTGGTAAGCG ATATGGTAAG AAAAAACGTT CATTGGTAAG 75 6 0 

CATGATTATT CAAATCATTG TTGTwATATT AACCACCATC GCTGGAATAC AGCATGGTGG 7620

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9834 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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{xi> SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

GTCATtACCG amTTTCCTAG AaTCATTTAA AGATGATAAA TATACAAACG TTGGTAATTT 6 0 

AAAAGAAGTG AATTTTGATA AAATTGCTGC GACGAAACCC GAAGTAATCT TTATCTCTGG 120

ACGTACAGCT AATCAAAAGA ATTTAGATGA ATTCAAAAAA GCTGCACCTA AAGCGAAAAT 180 

TGTTTATGTT GGTGCAGATG AAAAGAACTT AATTGGTTCA ATGAAACAAA ACACTGAAAA 24 0 

TATCGGAAAA ATTTACGATA AAGAAGATAA AGCTAAAGAA TTAAATAAAG ATTTAGATAA 3 00 

CAAAATTGCT TCAATGAAAG ATAAAACGAA AAACTTCAAT AAAACTGTTA TGTATTTACT 36 0 

AGTTAACGAA GGTGAATTAT CAACATTTGG ACCTAAAGGT CGTTTTGGTG GATTAGTTTA 42 0 

CGATACATTA GGATTCAATG CAGTTGATAA AAAAGTAAGT AATAG CAATC ATGGACAAAA 480 

TGTTTCTAAC GAATATGTTA ATAAAGAAAA TCCAGATGTT ATTTTAGCGA TGGATAGAGG 54 0 

TCAAGCGATA AGTGGTAAAT CAACTGCGAA ACAAGCATTA AATAATCCTG TATTAAAAAA 600

TGTTAAAGCA ATTAAAGAAG ACAAAGTATA TAATTTAGAT C CTAAATT AT GGTACTTTGC 660 

AGCTGGATCA ACTACAACTA CAATTAAACA AATTGAGGAA CTTGATAAAG TTGTAAAATA 720 

ATTTTAAAAG AGGGGAACAA TGGTTAAAGG TCTTAATCAT TGCTCCCCTC TTTTCTTTAA 780

AAAAGGAAAT CTGGGACGTC AATCAATGTC CTAGACTCTA AAATGTTCTG TTGTCAGTCG 84 0 

TTGGTTGAAT GAACATGTAC TTGTAACAAG TTCATTTCAA TACTAGTGGG CTCCAAACAT 900 

AGAGAAATTT GATTTTCAAT TTCTACTGAC AATGCAAGTT GGCGGGGCCC AAACATAGAG 960

AATTTCAAAA AGGAATTCTA CAGAAGTGGT GCTTTATCAT GTCTGACCCA CTCCCTATAA 102 0 

TGTTTTGACT ATGTTGTTTA AATTTCAAAA TAAATATGAT AG TGAT ATTT ACAGCGATTG 108 0 

TTAAACCGAG ATTGGCAATT TGGACAACGC TCTACCATCA TATATTCATT GATTGTTAAT 114 0 

TCGTQTTTGC ATACACCGCA TAAGATTGCT TTTTCGTTAA ATGAAGGCTC AGACCAACGC 12 0 0 

TTAATGGCGT GCTTTTCAAA CTCATTATGG CACTTATAGC ATGGATAGTA TTTATTACAA 1260 

CATTTAAATT T AATAG CAAT AATATCTTCT TCGGTAAAAT AATGGCGACA SCgTGTTTCA 13 20 

GTATCGATTA ATGAAC CAT A AACTTTAGGC ATAGACAAAG CTCCTTAACT TACGATTCCT 13 80 

TTGGATGTTC ACCAATAATG CGAACTTCAC GATTTAATTC AATGCCAAAT TTTTCTTTGA 1440

CGGTCTTTTG TACATAATGA ATAAGGTTTT CATAATCTGT AGCAGTTCCA TTGTCTACAT 1500 

TTACCATAAA ACCAGCGTGT TTGGTTGAAA CTTCAACGCC GCCAATACGG TGACCTTGCA 1560

AATTAGAATC TTGTATCAAT TTACCTGCAA AATGACCAGG CGGTCTTTGG AATACACTAC 1620

CACATGAAGG ATACTCTAAA GG TTGTTT AG ATTCTCTACG TTCTGTTAAA T CATC C ATTT 168 0 
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AGTGTTCTTT TTGAATAATG CTATTACGAT AATCTAACTC TAATTCTTTT GTTGTAAGTT 1800 

TAATTAACGA GCCTTGTTCG TTTACGCAAA GCGCATAGTC TATACAATCT TTAACTTCGC 1860

CACCATAAGC GCCAGCATTC ATATACACTG CACCACCAAT TGAACCTGGA ATACCACATG 192 0 

CAAATTCAAG GCCAGTAAGT GCGTAATCAC GAGCAACACG TGAGACATCA ATAATTGCAG 1980

CGCCGCTACC GGCTATTATC GCATCATCAG ATACTTCGAT ATGATCTAGT GATAATAAAC 204 0 

TAATTACAAT ACCGCGAATA CCACCTTCAC GGATAATAAT ATTTGAGCCA TTTCCTAAAT 2100 

ATGTAACAGG AATCTCATTT TGaTAGGCAT ATTTAACAAC TGCTTGTACT TCTTCATTTT 2160 

TAGTAGGGGT AATGTAAAAG TCGGCATTAC CACCTGTTTT AGTATAAGTG TATCGTTTTA 222 0 

AAGGTTCATC AACTTTAATT TTTTCATTTG GGATAAGTTG TTGTAAAGCT TGATAGATGT 22 80 

CTTTATTTAT CACTTCTCAG TACATCCTTT CTCATGTCTT TAATATCATA TAGTATTATA 2 34 0 

CCAATTTTAA AATTCATTTG CGAAAATTGA AAAGAAAGTA TTAGAATTAG TATAATTATA 24 0 0 

AAATACGGCA TTATTGTCGT TATAAGTATT TTTTACATAG TTTTTCAAAG TATTGTTGCT 24 6 0 

TTTGCATCTC ATATTGTCTA ATTGTTAAGC TATGTTGCAA TATTTGGTGT TTTTTTGTAT 2520 

TGAATTGCAA AG CAATAT C A TCATTAGTTG ATAAGAGGTA ATCAAGTGCA AGATAAGATT 2 580 

CAAATGTTTG GGTATTCATT TGAATGATAT GTAGACGCAC CTGTTGTTTT AGTTCATGAA 264 0 

AATTGTTAAA CTTCGCCATC ATAACTTTCT TAGTATATTT ATGATGCAAA CGATAAAACC 27 0 0 

CT A CAT AATT TAAGCGTTTT TCATCTAAGG ATGTAATATC ATGCAAATTT TCTACACCTA 276 0 

CTAAAATATC TAAAATTGGC TCTGTTGAAT ATTTAAAATG aTGctACCGC CAATATGTTT 2 820 

TGTATATTTT ACTGGGCTGT CTAAGAGGTT GAATAATAAT GATTCAATTT CAGTGTATTG 2 880 

TGATTGAAAA CAATTAGTTA AATCACTATT AATGAATGGT TGAACATTTG AATACATGAT 2 94 0 

AAACTcCTTT GATATTGAAA ATTAATTTAA TCACGATAAA GTCTGGAATA CTATAACATA 3 000 

ATTCATTTTC ATAATAAACA TGTTTTTGTA TAATGAATCT GTTAAGGAGT GCAATCATGA 3 060 

AAAAAATTGT TATTATCGCT GTTTTAGCGA TTTTATTTGT AGTAATAAGT GCTTGTGGTA 3120 

ATAAAGAAAA AGAGGCACAA CATCAATTTA CTAAGCAATT TAAAGATGTT GAGCAAAAAC 3180 

AAAAAGAATT ACAACATGTC ATGGATAATA TACATTTGAA AGAAATTGAT CATCTAAGTA 3 24 0 

AAACTGATAC AACTGATAAA AATAGTAAAG AATTTAAGGC ACTACAAGAA GATGTTAAAA 3 3 00 

ACCATCTCAT ACCTAAATTT G AAG CAT ATT ATAAGTCAGC AAAAAATTTG CCTGATGATA 3 3 60 

CAATGAAAGT T AAG AAA TT A AAAAAAGAAT ATATGACGCT TGCAAATGAG AAGAAGGATG 34 20 

CGATATATCA ATTAAAAAAA TTCATAGGTT TATGTAATCA ATCTATCAAG TATAACGAAG 34 80 
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AATTAGCTGA TAATAAAAGT GAAGCAACTA ATCTTACGAC AAAATTAGAA CATAATAATA 3 6 00 

AAGCGTTAAG AGATACTGCG AAGAAGAACC TAGATGATAG TAAAGAAAAT GAAGTAAAAG 366 0 

GCGCGATTAA AAATCACATT ATGCCAATGA TTGAAAAGCA AATTACCGAT ATTAACCAAA 3720

CTAATATTAG TGATAAGCAT GTTAATAATG CAAGGAAAAA CGCAATAGAA ATGTATTACA 378 0 

GTCTGCAGAA CTATTATAAT ACACGTATTG AAACAATAAA GGTTAGTGAG AAGTTATCAm 3840

AAGTCGATGT AGATAAGTTG C CG AAAAAGG GTATAGATAT AACTCACGGC GATAAAGCCT 3 90 0 

TTGAAAAAAA GCTTGAAAAA TTAGAAGAAA AATAACTATA ATCATTTTTC AAAGTTAAAA 3 96 0 

ATTTTGAATT TATGGTTAAC ATGTCAACTT ACTATGTGTA TAATGGTAAA CATTGATATT 4 02 0 

AACTATATGT ATAAAAATGT CACGCAGATG CTATTTAAAT GTGATAAATA TTTTTAGAGG 4 080 

TGAATAGAGT GGCTATAAAG CTAAGTTCAA TTGACCAATT TGAACAGGTT ATTGAGGAAA 414 0 

ATAAATATGT TTTTGTATTA AAACATAGTG AAACTTGTCC AATATCGGCA AATGCGTACG 4200

ATCAATTTAA TAAATTTTTA TATGAACGCG ATATGGACGG TTATTATTTG ATTGTCCAAC 42 6 0 

AAGAACGCGA TTTGTCAGAT TATATTGCTA AAAAAACGAA CGTTAAACAT GAATCACCTC 43 20 

AAGCATTTTA TTTTGTAAAT GGTGAAATGG TTTGGAATCG AGACCACGGT GATATCAATG 4380

TGTCGTCATT AGCACAAGCA GAAGAATAAT GAAACTATAG GGTTGGAACA TTTTGCCTTA 444 0 

CACTACTAGA CGTGAATAGC ACAACTTAAA TTCGTGTGAA TCAGAGTAGT TTGGCTATAA 4 500 

TGATGTTCTG ACCTTTTATT TTATGTCACC TTTAGAAGCA GTT AAGTT AG TACTTTTTTA 4 56 0 

CAAACATATG TATAATATAT TCGAGTATTT TTATTGAAAa tATTTTGGAA AACGACGAAT 4 6 20 

CCAATAAGAA AATTT AAA CA TGATTTGTAA GTTAGTTTAA TAGGAAATAT A TG CT AAA CC 4 6 80 

AAAAGAAGCA TATTGTTATT TACTGGAATA ATTAATAATC ATGTCATGTT AAATGTTAGC 4 74 0 

ATATAATCAC GAGATAAAAT CTAAAATTTA AGATTAATCT TTTATGAATA AAAAACGTAT 4 8 00 

CACAACAAAT AATAAAGTAA GGTGGTCAAG GTTATGAAAG TATTAGTAGC CATGGATGAG 4860

TTTCATGGAA TTATTTCAAG TTATCAAGCT AATAGATATG TTGAAGAGGC AGTTGCAAGC 4 920 

CAAATTGAAA CTGCAGATGT AGTTCAAGTA CCATTGTTTA ATGGAAGACA TGAATTATTA 4 9 80 

GATTCTGTAT TTTTATGGcm ATCTGGGcaA AAGTATCGTA TACCAGTACA TGATGCAGAT 5040

ATGAATGAAG TTGAAGGTGT TTACGGACAA ACTGATACAG GGATGACCGT TATCGAGGGG 5100 

AATTT AT TTT TAAAAGGTAA AAAACCAATT GTTGAACGAA CAAGTTATGG TTTAGGAGAA 5160 

ATGATTAAAC ATGCATTAGA TAACGACGCA AAACATGTTG TAATTTCACT AGGTGGGATT 5220

GATAGTTTTG ATGCTGGTGC AGGTATGTTA CAAGCATTAG GTGCTCAATT CT ATG ATG AC 52 8 0 
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GATATGTCGA ACTTACACCC TAAAATGGAA ACAGCAAGAA TTCAAGTAAT GTCGGATTTT 54 0 0 

TCAAGTCGAT TATATGGTAA GCAAAGTGAA ATCATGCAAA CTTATGATGC GCATCAGTTG 54 6 0 

AATCATAATC AAGCAGCAGA AATCGATAAT TTAATTTGGT ATTTTAGTGA GTTATTTAAA 5520

AGTGAATTGA AAATTGCAAT TGGTCCAGTT GAACGTGGTG GTGCTGGTGG TGGAATTGCA 55 8 0 

GCAGTCTTGA ATGGACTGTA TCAAGCTGAA AT ATT AA CCA GTCATGCATT AGTAGACCAA 564 0 

CTAACACATT TAGAAAATTT AGTTGAACAA GCGGATTTAA TTATTTTTGG AGAAGGATTA 570 0 

AATGAAAATG ATCAGTTGCT AGAAACGACA ACATTGCGTA TTGCAGAACT TTGTCATAAA 576 0 

CATCAAAAGG TTGCCATTGC AATTTGTGCA ACTGCTGAAA AGTTTGATTT ATTTGAATCA 582 0 

CAAGGGGTTA CAGCAATGTT TAATACATTT ATCGATATGC CAGAAACTTA TACTGACTTT 5880 

AAAATGGGtT ACAAATTAGG CATTATACGG TTCAGTCTTT AAAACTGTTG AAAACACATT 5 94 0 

TTAATGTTGA GGTTTAGTAA AGAAGGACTA AATTGGTGAT GCTGTCATGA TGGTTAATAA 6000

CATTTATGAT GGTTAGCAAA ACGAATTAGA AGATCGAAAG TATACG T AAA AAATATGAAA 6 060 

AATCACGCTA TCATTGCACT GAATGTTAGC GTGATTTTTA TATATTAATT AAGCCTGAGT 6120 

TGAACTAGTA TATAATCGTT GGTTTTTAGT GATTTTCAGC GATATCTTCT ACAATTCCAA 6180

TGATTACTTG TACTGCTTTT TCCaTAACAT CAATGGATGC a T ATT CAT AT GGGCCGTGGA 6 24 0 

AGTTACCGCA ACCTGTAAAG ATGTTTGGAG TTGGTAACCC CATAAATGAC AATTGTGAAC 6 3 00 

CATCTGTACC ACCGCGAATA GGTTCAGTGT TTGCTGGAAT ATCTAATTTG GCAAAGACAC 6360

GTTTAGGTAT ATCAATAATA TGAGGCAATG GTAATATTTT TTCTGCCATA TTGAAATATT 64 2 0 

GATCCGATAT ATCAACTTTA ACTGGATAAT TTTCAAAATG GGCATTGATA TCGTCACGTA 6480

TTTCTAAAAT A CG TTTCTT A CGCAATTCGA ATTGTTTTTT ATCATGATCA CGAATAATGT 654 0 

ATTGCAAAGT TGCTTTTTCA ACAGTTCCTT CAAAGTTCAT TAAGTGATAA AAGCCTTCGT 6 6 00 

ATCCTTCTGT TCGCTCCGGA ACTTCACTAT CAGGTAGCAA ACTATCGAAT TGTTCACCTA 6 6 60 

AACGTATTGC GTTTACCATT G CATTTTT AG CTGAACCAGG ATGAACATTT ACACCGTGGC 6 72 0 

ATGTAATAAC CGCTTCAGCA GCGTTAAAGC TTTCATATTG TAATTCTCCA TATTGACTAC 6 78 0 

CATCCATAGT ATAAGCAAAA TCAGCATTGA AGCGGTCAAC ATCAAATTTA TGTGGACCAC 6840

GACCGATTTC TTCGTCTGGT GTAAATCCAA TGCGAATGGT ACCATGTTTA ATTTCTGGAT 6 90 0 

GTTCTTGTAA ATAACAAATA GCTTCCATAA TTTCCACAAT ACCCGCTTTA TCGTCTGCAC 6 960 

CTAGTAACGA TGTACCATCA GTTACCATTA ATGTATGACC AACTAAACTG TTAAGTTCTG 7020

GAAATACTTT AGGATCTAAG ACACGTTTAG TATTGCCTAG TTTGTATGGC TTACCATCAT 7080

GCGCCAAAAA TCCAACTGTT GGGACGTCGA CATCGATGIT ACTTTCTAAT GTAGCAAATA 7200

AGTAGCCATT TTCATCTAAA TCAGTTGGCA ATCCTAATTG TTGTAATTCT TTTTCTAATA 72 60 

AATGTAACAA ATCCCATTGC TTTTCAGTTG AAGGTGTTGT TGTAGATTTT GGATCAGATT 7320

GCGTATCAAT TGTCGTATAT CTTGTTAATC TATCTATCAA TTGGTTCTTC ATTATATTCG 7380

ACCCCTTAAA CTCTATTATT CATGTTGTAA GATTTTTTAT ATGTCTTACC TTTGATTTTA 744 0 

CCATACAGTT GTTTGATACG TGTGTATAGG TAATATAGAA TTTCAGAAAC TAATATACCG 7 50 0 

AAAGCAATCG CACCTGAAAT CAGTGTAcTT CTAAAAATGT ATTTACAGCA CTTGTATAAT 756 0 

CATTTGATAC TAAAAAACGA GTCGCTTGAT AAGCTGCACC ACCAGGTACT AATGGTATAA 7620 

TGCCTGGCAC TATGAATATA ATTACCGGTC GTTTATATCT GCGACTCATA GTATGACTCA 7680 

TTAAGCCTAA AATTAAGCTT CCCAAAAATG AAGCGCCAAC TTTTCCAAAC TCTAAATCTA 774 0 

CCGTTAATTG GTAAATCGTC CATGCAATGG CACCCACAAA TCCACATGCT ACTAAGAGGC 7800

GTTTGGGTGC ATTGAAAATG ATAGAGAAAA GTACTGTTGA TATAAAGCTG ATTGTAAAAT 7860 

GAAATAAATA AAATAGCATG CTTTAACAGT CCTTCCTTAA ATGATTAATA AAACGATTGC 7 920 

GACACCAGCA CCGATTGCGA ATGCTGTTAA TGCAGCTTCA ACACCGCGAG ACATACCTGC 7980

AAGTAATTCA CCCGCTAATA AATCTCGAAT GGCATTGGTA ATTAATATAC CAGGGACAAG 8 04 0 

TGGCATGACA CTGG CT AT AG TAATGATATC TTGATTGGTT GCAATGCCTA ATTTAGTAAA 8100 

TGTGGCTGCA ATGGATATGA CCACAGCGGC TGCAACAAAC TCTGAGAAAA ATTTAATTTG 8160

TATATAGCGT tGCACAAAGC TGAATGTTAA AAATGCGGAT CCGCCAGCAA TGACTGCAAT 82 2 0 

CCAACAATCT GATGCGACAC CACCAAACAT AAATAGGAAG AAGCCACATG CAATGGCAGC 8280

TGCAAAGAAA TTCGTTAAAA AAGAATATTG TAATGAPGCA TGCTGTAAAT GAATAAATTC 834 0 

AGATTTAGCT TCATCAATTG TGAGTTCTTT ATTTGATATT TTACGTGAAA GACTATTCGT 84 0 0 

TAAAGCGATT TTCTCTAAAT CTGTTGTACG CTCTTGTACA CGAATTAATC TTGTACTTGT 84 6 0 

TCGATCGTTT AATGAAAAAA TAATTGCAGT TGAACTGACA AAACTATATG TATTATGAAG 8520 

ACCATAACTA TGTGCGATAC GGTTCATTGT ATCTTCAACT CGATATGTTT CAGCACCTGA 85 80 

TTCaAGTAAA ATTCTACCTG CAATTAATAC AACATCAATC ACTTTGTTTT CATCTATAAT 8640

TGTGATTGAA TCTGGCATAT CAATTCACCT CCAATGATAT GTGTTATTTA TTTGAACAAT 8700

TGaAGTTTAC AACTTGTTGT TACAACTTTC AATAGTGAGA CTTTGTGTTA GTATGATGAA 8 76 0 

CTTGTATGGT TCAAATTTAA ATAAGAAAAA CTGTTAATCT TTGCTATTAT ACTATGATTT 8820

AATAATAGCA AAGGATTAAC AGTTTTGTCG TTGTTATAAA TTGATAATAG GGITAAACAT 8880

TTTACGCTGT GATTTTGGAT CGTCATCTGT TAAATAACCA ACACCGATAG ACACTGACAA 9000

TTTAA7AACT TCTTTGTTTG GTAAATGGAA TGATGATTTT TCAACAC CCG AACGAATATT 906 0 

TTCAGCTAAT TTAACACTTT GATCAAGTGA ATAATTGTGA ATGACAACTG AGAACTCTTC 912 0 

GCCACCATTT CTAAAAATTT TAAATTGATT CGGCACATAG TTTTTAAGTA ATTGAGACAT 9180 

TTGTTTTAAT ACAGCATCAC CTGATTTGTG TGAGTAGGTA TCATTGaCAT CTTTAAATCC 924 0 

ATCGATATCG ATTAATAATA ATGCGATACT TTGATGTTCT TTTTCAGCTT TTCGTGAAAT 930 0 

TTCATTTAAA TGTCTATCAA ATTCTTTTAC ATTACCTAAG CCTGTTAAGT AATCATATTT 936 0 

ATCTTCGTTT TCATAACGAT TTACGAGTGA GAAGAAATGC CAAATATCGA CAAATGTTAT 942 0 

CGCTGAAGCT AAAGTGATAA TTAATGAAAT TGGTATTAAA ATGATAACTT CCGATAGTGT 94 80 

GTAAATAGGA CTCACTAACG CGACACCAAA TAAAATGATT ATTGTAACAA CATTAAGTAT 954 0 

TAATAATGAT AG CACAT CAT TTTGTTTTAA AAATGGTCCA AT AG CA CTTG TTACTGCAGC 9600 

AATAACAATC AACGTAACAC CGTACATAAT CGAGTTGTTA AATACTACAA TTTCAACAAT 9660 

TGCTACAATT ACTGTGGCAG ATAATGTATA GACCATATTT GTAAATCTAC CTAAAAACAA 9720 

TAAAGGAACG AATGTTAAGT GAATTAAATA ATCTTCACGA TAAGGGATAG GGTAGACAGA 978 0 

TAATAATAAT GATACGATTG TCATTAAAAC AGTGACATAA GCCTTAGAAA AAAC 9834 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23439 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

TCTCAATCAG AT G AAAAATT GCATATCGTA GGTTTTACAG AAAGTGCAAA ATATAATGCG 6 0 

TCATCAGTCA TTTTCACGAA TGACGCTACC ATTGCCAAGA TCAATCCTAG ATTGACTGGA 120 

GATAAAATTA ATGCAGTTGT TGTACGTGAT ACAAATTGGA AAGACAAAAA ATTAAACCAA 180 

GAG CTTG AAG CGGTAAGTAT TAATGA CTTT ATTGAAAATT TACCAGGTTA TAAACCACAG 24 0 

AACTTAACAT T AAA CTTT AT GATTTCATTC TTATTTGTCA TTTCAGCTAC AGTTATAGGC 3 00 

ATTTTCCTAT ATGTCATGAC ATTACAAAAG ACGAGTTTAT TTGG CATATT AAAAGCTCAA 360 

GGATTTACGA ATGGCTATTT GGCGAATGTG GTAATTTCGC AGACGGTCAT ATTAGCACTA 42 0 

TTTGG7ACGG CATTTGGCTT ACTGTTAACA GGCGTTACAG GTGCATTTTT ACCTGATGCA 480

TCTGTATTAG GAAGTTTATT CTCCATTTTA ACAATTAGAA AAATAGATCC GTTAAAGGCG 600

ATTGGGTAGG AGGTGTAGCA AATGTTGAAA TTTGAAAATG TAACAAAGTC ATTTAAAGAT 66 0 

GGGAATCGTA ACATTGAAGC GGTTAAAGAT ACAAATTTTG AGATAAATAA AGGTGATATT 720

ATAGCATTGG TTGGACCTTC TGGCTCTGGT AAAAGTACAT TTCTAACTAT GGCAGGTGCT 78 0 

TTACAAACAC CGACATCTGG GCACATTTTA ATCAATAACC AAGATATTAC GACAATGAAG 84 0 

CAAAAAGCAT TGGCAAAAGT TAGAATGTCT GAAATAGGTT TTATTTTACA AGCTACAAAC 90 0 

CTTGTACCAT TTTTAACGGT AAAGCAACAA TTTACATTAT TGAAAAAGAA AAATAAGAAT 96 0 

GTTATGTCTA ATGAAGACTA TCAGCAACTT ATGTCACAAT TAGGTCTAAC TTCATTGCTT 102 0 

AATAAGTTAC CTTCAGAAAT TTCAGGTGGT CAGAAACAAC GTGTGGCGAT AgCaAAGCGT 108 0 

TATATACGAA TCCGTCGATT ATTTTAGCGG ATGAACCTAC CGCGGCGTTA GATACTGAAA 114 0 

ATGCGATTGA AGTCATTAAA ATTCTACGTG ATCAAGCCAA ACAAAGAAAG AAAGCATGTA 1200

TTATTG7TAC ACATGATGAA CGACTTAAAG CATATTGTGA TCGTTCATAT CATATGAAAG 1260 

ATGGCGTCCT TAATCTTGAA AATGAAACAG TAGAATAGTT TTATTAAGCC GGTACATCAT 1320 

GTGCCGGTAT TTTTATGTTT ATGTATTATT TGAATAAACT TTCACATTCA ATTAATAATA 1380

ATTATTATCG AAAATCAGAA ATATTCCGTG AAATATAATA TTTTTTGTAG TAAAATGGCC 144 0 

TCTAAGTATT CAATATTTAA ATATGGGGAT TGAATATAAA ATTATCGTAA TGGGGGTCAA 1500 

TGGTTATGGA TTTATTGATA GGTACTTTAT TTTTATTTTT GGTCTTAGTG ATTTTTACAT 1560

TATTTACATA TAAAGCGCCT AATGGTATGC GTGCCATGGG AGCATTAGCT AATGCAGCAA 1620 

TCGCAACATT TTTAGTGGAA GCATTTAATA AATATGTTGG TGGCGAAGTA TTCGGTATTA 1680 

AATTTTTAGA AGAGCTAGGA GACGCTGCGG GAGGTCTAGG TGGTGTCGCT GCCGCTGGAT 174 0 

TAACAGCATT AGCTATCGGT GTGTCACCAG TATATGCATT AGTTATAGCA GCCGCGTGCG 18 00 

GTGGTATGGA TTTATTACCA GGTTTCTTTG CGGGTTATAT GATTGGATAT GTGATGAAAT 1860

ATACAGAGAA ATATGTGCCG GATGGTGTCG ACTTAATTGG ATCGATTGTC ATCTTAGCGC 1920 

CATTAGCTCG TCTTATTGCA GTATTATTAA CGCCAGTAGT GAATAGTACA TTGATTCGAA 198 0 

TTGGTGATAT TATCCAAAGT AGTACGAATA CGAATCCAAT TATCATGGGT ATCATTTTAG 2040

GTGGTATTAT TACGG7TGTC GGCACAGCGC CATTGAGTTC AATGGCATTG ACAGCATTAT 2100 

T AGG TTT AAC GGGTGTACCT ATGGCTATTG GTGCCATGGC AGCATTTAGT TCGGCATTTA 2160 

TGAATGGGAC GCTATTCCAT CGCTTAAAAT TAGGTGATCG TAAGTCTACG ATTGCAGTAA 2220

GTATTGAACC TTTATCACAA GCAGATATTG TATCAGCCAA TCCAATTCCA ATCTATATTA 2280

ATGCGACAGG TACAGCTACA CCGATTGCAG GATTTTTAGT TATGTTTGGA TTTAATCATC 2400

CGACGACAAT TGTGATTTAT GGTGTAGTAA TGGCGATTGT AGGTGCGCTT G CAGGTTATC 24 6 0 

TTGGTTCAAT TGTATTTAAA AAATATCCAA TTGTTACTAA GCAAGACATG ATTAATCGAG 2520

GTGCAGTAGA CGCA7AGCA7 CATCATATTG AATAGTAAAA ACAAATAAAA CATAGTAACG 2 58 0 

TGATTCAGTC GATGTAACAG TCGATAATGA GTCACGTTTT TTTATAGAAA AATACAAGAC 2 64 0 

ATAAAAATGT CATAATTTAT TGTCGACAAA TATCATACTG TATAAACATT TATCATTTTC 2700

TCAAGTACCT TTTACACGAT GGAATGAACT TA CTTTTT AC GAAATTATGC GTATTTTATA 2 76 0 

AACAAATATC ATTGATATAA CGGTAAATGT AAGCGTTTAC AACAGAAATA ACAGCATGCT 2820 

ACGATATTTT TGTAAATTCA CTGATTCAAG TATTTTAAGT CAATATGAGG AGGGATGTTA 2880 

TGAGCGATTC TGAGAAAGAA ATTTTAAAAA GAATTAAAGA TAATCCGTTT ATTTCACAAC 2 940 

GTGAACTTGC TGAGGCAATT GGATTATCTA GACCCAGCGT AGCAAACATT ATTTCAGGAT 3 0 00 

TAATACAAAA GGAATATGTT ATGGGAAAGG CATATGTTTT AAATGAAGAT TATCCTATTG 3060 

TTTGTATTGG CGCAGCGAAT GTAGATCGTA AGTTTTATGT GCATAAAAAT TTAGTTGCAG 312 0 

AAACATCAAA TCCTGTAACG TCAACACGCT CTATTGGTGG CGTAgCAAGA AATATTGCTG 3180

AGAACTTAGG TAGGCTTGGC GAAACGGTCG CTTTTTTATC TGCTAGTGGA CAAGATAGTG 324 0 

AATGGGAAAT GATTAAACGA TTGTCCACAC CATTTATGAA TTTGGATCAT GTTCAACAAT 33 00 

TTGAAAATGC GAGTACAGGT TCATATACAG CTTTAATTAG TAAAGAAGGC GACATGACAT 3360

ATGGCTTaGC AGATATGGAA GTGTTTGACT ACATTACGCC TGAATTTTTA ATTAAGCGTT 34 20 

CACACTTATT GAAAAAGGCT AAGTGCATTA TTGTAGATTT GAATTTAGGC AAAGAGGCAT 34 80 

TAAACTTCTT ATGTGCCTAT ACCACGAAAC ATCAAATCAA ATTAGTTATC ACCACGGTTT 3540

CTTCCCCAAA AATGAAAAAT ATGCCTGATT CATTACATGC TATTGATTGG ATTATCACGA 3 6 00 

ATAAAGATGA AACAGAAACA TACTTAAATT TAAAAATAGA ATCTACTGAT GATTTAAAAA 3 66 0 

TAGC7GCTAA ACGCTGGAAT GATTTAGGTG TTAAAAATGT TATTGTGACA AATGGCGTGA 3 72 0 

AAGAACTCAT TTATCGAAGT GGTGAGGAAG AAATCATTAA GTCAGTTATG CCATCAAATA 3 780 

GTGTGAAAGA TGTTACAGGT GCAGGCGATT CATTCTGTGC TGCAGTAGTG TATAGCTGGT 3 84 0 

TAAATGGGAT GTCTACTGAA GATATATTAA TTGCTGGTAT GGTTAACGCA AAGAAAACGA 3 900 

TAGAAACGAA ATATACAGTT AGGCAAAACC TAG AT CAACA GCAACTTTAT CACGATATGG 3 96 0 

AGGATTATAA AAATGGCAAA TTTACAAAAG TATATTGAGT ATTCTCGAGA AGTTCAGCAA 4020

GCACGGGAGA ACAATCAACC GATTGTAGCA TTAGAATCAA CAATTATTTC GCATGGTATG 4080

GCCATTCCAG CAACCATAGC CATTATAGAT GGCAAAATTA AAATTGGTTT AGAAAGCGAA 4200

GATTTAGAAA TACTGGCAAC TAGTAAAGAC GTTGCTAAAG TATCTAGAAG GGATTTAGCA 4260

GAAGTTATTG CGATGAAGTG TGTTGGTGCT ACTACTGTAG CGACGACGAT GATATGTGCT 4320

GCAATGGCTG GTATTCAATT TTTTGTTACA GGAGGTATTG GGGGCGTCCA TAAAGGTGCA 4380

GAACATACGA TGGACATTTC AGCAGACTTA GAAGAACTGT CTAAAACAAA TGTCACTGTT 4440

ATCTGTGCAG GTGCCAAATC AATTTTAGAC TTACCTAAGA CGATGGAGTA TTTAGAAACA 4500

AAAGGCGTTC CAGTTATTGG ATATCAAACG AATGAATTGC CAGCATTCTT CACTCGCGAA 4560

AGCGGTGTTA AGTTAACAAG TTCGGTTGAA ACGCCAGAAC GACTTGCTGA CATTCATTTA 4620

ACAAAACAGC AGTTAAATCT TGAAGGTGGC ATTGTTGTTG CTAATCCAAT TCCATATGAG 4680

CATGCCTTAT CAAAAGCATA TATTGAGGCA ATCATAAATG AAGCTGTTGT TGAAGCGGAA 4740

AATCAAGGTA TTAAAGGTAA GGACGCCACA CCGTTCTTGT TAGGGAAAAT TGTAGAAAAA 4800

ACGAATGGTA AAAGTTTAGC AGCAAATATA AAACTTGTTG AAAACAATGC GGCGTTGGGT 4860

GCTAAAATTG CTGTCGCTGT TAATAAATTA TTGTAGGTGA TGATACATGA ATATTTTATT 4920

CGCTATCACA GGGATAGCAT TTGCACTATT TGTTGCGTTT TTATTCAGTT TTGATCGTAA 4980

AAAAATAGAC TTCAAAAAGA CGTTAATAAT GATATTTATT CAAGTGTTGA TCGTGTTATT 5040

TATGATGAAC ACAACGATTG GTTTGACAAT TTTAACTGCA CTAGGTTCAT TTTTTGAAGG 5100

GCTAATAAAT ATTAGTAAAG CAGGCATAAA TTTTGTTTTT GGAGATATAC AAAATAAAAA 5160

TGGCTTTACG TTCTTTTTAA ACGTATTACT GCCATTAGTT TTTATTTCTG TATTAATAGG 5220

CATCTTTAAT TATATTAAGG TATTACCATT TATTATCAAA TATGTAGGTA TCGCTATTAA 5280

TAAAATAACT AGAATGGGGC GCTTAGAAAG TTATTTTGCT ATTTCAACAG CAATGTTTGG 5340

GCAACCAGAA GTATATTTAA CAATAAAAGA TATTATTCCA AGATTATCTA GAGCGAAATT 5400

ATATACAATT GCGACGTCTG GTATGAGTGC TGTTAGTATG GCAATGCTAG GTTCATATAT 5460

GCAGATGATT GAACCCAAGT TCGTAGTTAC AGCAGTAATG TTAAATATTT TTAGTGCGCT 5520

TATCATCGCC AGTGTAATCA ATCCCTATAA ATCTGATGAT ACTGATGTTG AAATTGATAA 5580

CTTAACGAAA TCCACAGAAA CTAAAACATT GAATGGAAAA ACAGGAAAAC CTAAGAAAGT 5640

TGCCTTTTTC CAAATGATTG GTGATAGTGC GATGGATGGG TTTAAAATCG CTGTTGTAGT 5700

AGCCGTAATG TTGTTAGCAT TTATTTCATT AATGGAAGCA ATTAATATCA 7GTTTGGTAG 5760

TGTTGGTTTG AACTTTAAAC AGCTTATTGG CTATGTGTTT GCACCAATCG CATTCTTAAT 5820

GGGGATTCCA TGGAGCGAAC TGTTCCAGCT GGCTCTTTAA TGGCGACTAA ATTAATTACA 5880

CAAGGTATCA TTTCAGTTTA CTTAGTAAGC TTCGCTAATT TTGGTACGGT TGGTATCATC 6000

GTAGGTTCAA TTAAAGGCAT TAGTGATAAA CAAGGAGAAA AAGTTGCATC CTTTGCAATG 6060

AGGTTGCTAC TTGGTTCAAC TCTAGCTTCA ATCATTTCAG GATCAATCAT TGGCTTAGTA 6120

TTGTAAATGA ATCGAAGTAC CTAAATTAAA TTCATGGCAA AGCTAAACCC CGTCACCAAG 6180

TTGGCGCAAC AGCGcATgcA TAACTTAGTG ACGGGGTTTT ATCATAACAA TCTACTTTTT 6240

CGTAGCCGTT TTTGAAATGT ATGTTGATGG TTTATCTTTT TCAAAAATTG TTAATCCCGT 6300

TATATCTTTT TTATGTTTTG AAGGGACAAT GAAGCTAAGT ATATAAGCAA AGACAAAAGC 6360

AACTGTAAAT GAAATGGTAG ATACATAGAA AGGTGAGTTA CCTTTGCCAA CACCATTATA 6420

GACATAAGCA AAGATGATAC CCAATATTAA TCCACAAATA ACACCGAATG TATTCGTACG 6480

TTTAGTGAAA ATACCAACTG CAAATACACC AGCCAATGGA ACGCCGAATA ATCCAGTCAC 6540

AAACAAGAAT AAATCCCATA AGTCATTTGA ATTAGAAGCA ATTAAGTATA GTGACATTCC 6600

AAAACCGAAA ATACCTGCAA TGATAATAAT GAAACGTGCA AAGTTAACTT CGTGTCGCTC 6660

GCTACCTTTT CCGAAGAAGC GTTGCTTAAT GTCGATTGAA ATACAAGCAG ATATAGAATT 6720

TAAACTAGAT GAAATGGTAG ACTGTGCAGC GGCGAAAATG GCTGCAATAA GTAATCCTGC 6780

TACAAATGGT GGCATCTCAG TCAAAATGAA ATATGGCACT ACAGATGATG TATTGAAGCC 6840

TTTTGGTAAA ACAGCTTCAT GTGTATAAAA TGAATACAGC ATTGTACCCA TACCATAAAA 6900

TAAGGGTGCT GAAATTAAAG CTAGGATACC ATTTGTCCAT AACGATTTAT TTGTTTCTTT 6960

TAAACTATCA GAAGCTTGAT AACGCTGCAC GACGTCTTGA CTCGCTGTGT ATTGATACAA 7020

GTTGTTGAAA ATATTTCCTA GGAAAATAAT TGGAATGGCA GCTGCCGCAG TATTTAGTTT 7080

CCAATTGTCT GCACTAATTA ATTTTTTGTG CTCAATCGCA TCTGCAAAGA CAGTGCCGAA 7140

ACCG^CTTTA ATGTTCACAA CACCTAGAAT AATAATAACT AAAGCGCCGC CTAATAAAAT 7200

GACGCCTTGA ATGAAATCAC TCCAAACCAC ACCTTCGAAA CCACCTAAAA ATGTATATAA 7260

AATACATAGT AAACCAACGA GTGATGCAAC GATATAAGGG TTCATGTCTG ATACAGATGT 7320

GATTGCTAAT GTTGGTAAGT AGATAACAAT TGCAACACGC CCTAAATGGT AAACGACAAA 7380

TAATAATGAG CCAATGACAC GL I a CTAAAT ATTCATATGC 7440

AGATGTTACC TTTAACTTTT TAAAGAAAGG GACATAGAAA TAAATAAGTA ATGGAATAAT 7500

TGCGACGATA GCAATGTTAC CAGCGATATA TGACCAATCT GTTAAAAATG CTTTCTCTGG 7560

TGTCGACATA AATGTAATCG CACTTAACGT AGTAGCATAA ATTGAAAAGC CAACTACCCA 7620

AGATGGCAAG CGACCACTTG CGGTAAAGAA ACTATTGGTA CTTTGGCTCG CGCGCTTGGT 7680

TGTGCCAAAT CCAACTTCTT TCATGGGCAA CATCCCCTTT ACAATGTATT GA7TCTTTGA 7800

TGTCTATAAA TCGTATTTTG CAATGAGTTG ATCTAATGTT TGTCGATGTG CTTCGTTAAA 7860

AGG7TTGAAA GGTCTTTTCG GTAATCCTGC ATCAATGCCA CGATGACGTA ATATTTCTTT 7920

CAATGTTGGA TAAATCCCCA TTGATAACAC TGTTTCGATA ATGTCGTTTG AATCATGTTG 7980

CAGTTGGTAA GCTTCTTGAA TTTGACCTTG TCGTGCTAAG TCGAAGATTT TTCTTGCACG 8040

GCGACCATTA ACGTTATATG TAGAACCAAT TGCACCATCT ACGCCAGAAA TCGTAGCTTG 8100

AACTAACATT TCATCAAAGC CAGATAAGAT TAATTTGTCT GGGAATGCTT TTCTAATACG 8160

TTCGAGTAGG AAGAAGTTTG GCGCTGTATA TTTAACACCA ACAATTTTTT CATGATTAAA 8220

TAGCTCGCTG AATTGTTCAA TAGAAATATT CACACCTGTT AAATCTGGTA TTGCATAAAT 8280

AATCATATTG TTCTGAGTTG CTTCGATAAT ATCGAAATAG TAATCTCTAA TTTCTTCAAA 8340

AGTAAATGGA TAGTAGAATG GTGTTACGGC AGAAAGTGCA TCATAACCGA GTTCTGTGGC 8400

ATATTTTCCA AGTTCAATGG CTTCATTTAA ATCTAACGAA CCTACTTGAG CAATCAATTT 8460

CACTTTATCC CCAACTGCCT CTTTGGCAAC CTTGAAAACT TGCTTCTTCT GCTCTGTATT 8520

TAATAAAAAG TTTTCGCCTG AGCTACCATT TACATAAAGA CCGTCTAATT CTTCAGTTTC 8580

AATGGCATTT TGAGCAATTT GTTTAAGTCC TTGTTCATTT ACTTGACCAT TTTCATCAAA 8640

AGGAACGAGT AACGCTGCAT ATAAACCTTT TAAATCTTTG TTCATTATGA AGTCCCTCCA 8700

AAAATCATTT GATAATATAG TTTACAGCTA TAATTGTAAA CGCTATCATA AAATGTAACA 8760

ATATCTTTTT GAAAATTGTA GTCATATTTA TGTATAATTA ATGAAAATGT TTTTCAAAAT 8820

CAATAGAAAT GGAGTGAGTA AGGTGTATTA CATCGCAATC GATATTGGAG GCACTCAAAT 8880

TAAATCGGCA GTTATTGATA AGCAATTGAA TATGTTTGAC TATCAACAAA TATCAACGCC 8940

GGACAACAAA AGTGAGCTTA TTACTGACAA AGTATATGAG ATTGTAACAG GATATATGAA 9000

GCAATATCAG TTGATCCAAC CTGTCATAGG TATTTCATCA GCAGGCGTTG TTGATGAACA 9060

AAAAGGCGAA ATTGTATACG CAGGGCCAAC CATTCCGAAT TATAAAGGTA CTAATTTTAA 9120

GCGATTATTA AAATCACTGT CTCCTTATGT CAAAGTAAAA AATGATGTAA ACGCTGCATT 9180

ACTAGGCGAA TTGAAATTAC ATCAATATCA AGCAGAACGG ATCTTTTGTA TGACGCTTGG 9240

TACAGGCATT GGGGGTGCGT ACAAGAATAA TCAAGGTCAT ATTGATAATG GTGAGCTTCA 9300

TAAGGCAAAT GAAGTTGGGT ATTTATTGTA TCGTCCAACT GAAAATACAA CGTTTGAGCA 9360

ACGTGCTGCA ACGAGTGCAT TGAAAAAGCG CATGATTGCC GGAGGATTTA CGAGAAGCAC 9420

ACATGTGCCA GTATTGTTTG AAGCAGCTGA AGAAGGTGAT GATATTGCAA AACAAATATT 9480

AGGGCTTATA TTAATTGGGG GCGGTATATC TGAACAAGGA GATAATCTCA TTAAATATAT 9600

CGAGCCGAAA GTTGCACACT ATTTACCAAA AGACTATGTT TATGCACCAA TACAAACGAC 96 6 0 

TAAGAGTAAA AATGATGCAG CATTATATGG CTGTTTGCAA TGATAGTTGA AAGAAGGAGT 972 0 

CATTCTAAAA TAGAATTTGA AACCGTTACG AGAGATGAGA GCTGTTGTTA GTTCCACACA 978 0 

TCACACTCTA TCTAGGACCA ATCTAAACTA TATCAACCAA cAGTGTGCCA CGGGCAAATT 984 0 

AAATTGAAGA AGCTGAGATA TTAAAATTTT AGAAAATGTA AAAAAATATT TGGTATTGAA 9900 

ATTAAAAAAG CACCTAGCAA CTCGTTGGGA CAATCACGAT GATTGTCTAC AGTTGCAGGT 9960 

GGATTTGAAT ATACTACTAG TTATTTGTTG TCTAGGATAA TAGATTTAGT ATGTTGATAA 10020

GTTTGACTCA GATTCGTATT TTCTAATAAA TGATAACTCA CGATATCGAT TAAAAAGAGT 100 80 

GTCGCAATTT GTGTGTTGAT AAATTGATGG TCGGTATTAC G CG ATTGATC CGTTGTTAAA 1014 0 

AGTACTAAAT CTGCACAATC TGTAAGTTTA CTACCTTCAA AATTTGTGAT GGCAACGACA 10200

TATGCACCAT GAGATTTGGC GACTTCCGCT GCAGAAATTA ATTCCGAAGT ATTACCACTA 10260 

TTTGACATAG CAATAAACAT ATCCGAATGA GATAGTAGGG ATGCCGATAT TTTCATTAAA 10320 

TGTGAATCGG TAGTAACATT ACCTTTTAGC CCCATACGAA TCATACGATA ATAAAATTCA 10380

GTCGCTGATA AACCAGAGCT ACCTAGTCCA GCAAAGAGTA TATGTCGACT TGATTGAAGT 10440 

TTGTCGATAA AGGTTTGGAT AATGTCGTTA TCAATAAATT CACCAGTTTG TTGAATGATT 10 500 

TGTTGATGAT ATTTATGAAT TCTTTGAATA ATTGGGCTAT TTTCAATAAC TGTCTCTGTC 10 560 

ATTTCTTGTT GAATATTAAA TTTTAAATCT TGGAAATTCT CATAATCCAG CTTATGACTA 10620 

AAGCGTGTCA TCGTTGCTGG TGATGTACCA ATCGCATGGG CTAAGGAGTT AATCGTTGAA 10680 

AAGGCATCGC TATAACCATT TTGTCTTATA TAATTGACGA TGCGTTTATC AGTTTTTGTA 10740 

AATAAATGTT GATAACGTTG AACACGATTC TCAAATTTCA TTGTGTCACC CCTTCATCTT 10 800 

AATGATTACT ATTATATATG AAAAATATTT TCAAGATAGT AAAAAGCATT GATAAAAATT 10 860 

ATCTTAATGA TATATTGTAA ATGACTTTAC GTGAAAAAAC GACTTATGCA GTGAGGAATA 10920 

ATGTTACCAC ATGGATTAAT AGTATCTTGT CAGGCACTAC CAGATGAACC ATTGCATTCA 10 980 

TCTTTTATTA TGTCGAAAAT GGCATTAGCT GCGTATGAAG GTGGTGCTGT TGGTATTCGC 11040

GCAAATACTA AGGAAGACAT TTTAGCAATT AAAGAAACGG TAGATTTACC AGTTATTGGC 1110 0 

ATTGTGAAAC GTGACTATGA TCACTCAGAT GTTTTCATTA CTGCAACGTC AAAAGAAGTT 1116 0 

GATGAACTGA TAGAAAGCCA ATGTGAAGTC ATTGCATTGG ATGCAACGTT ACAGCAACGT 11220

CCGAAAGAAA CGTTAGACGA ATTAGTATCA TATATTAGAA CACATGCACC GAACGTTGAA 11280

TATATTGGCA CGACGTTACA TGGCTATACT AGTTATACGC AAGGACAATT ACTTTATCAA 11400

AATGACTTCC AATTTTTAAA AGATGTACTA CAAAGTGTTG ATGCAAAAGT TATTGCGGAA 11460

GGTAATGTCA TTACACCGGA TATGTATAAA CGTGTGATGG ACTTAGGCGT TCATTGTTCA 11520

GTCGTTGGTG GTGCGATAAC ACGACCAAAA GAAATTACGA AACGTTTTGT TCAAATTATG 11580

GAAGA7TAAA TGATAACGAT AAAAAAACGA GATGACCATC ATTAATTAAA GGCACCTAAT 11640

TATCTTAGGT GGCTGAATGA ATGTAATGGG TTCATCTCGT TTTGTTTGTT TATGATAGTG 11700

ATTTTATTTT CAACTTTATC CAAAAATAAG TAAAGCGACG GGGATGGTGA TTAATAGCGA 11760

CAACGCCACG CGTAAAAACC AAATGATGAT GAGTTTCCAG ACAGGTATTT TAATTTCAGT 11820

TGCTAGTATA CATGGCACTA ATGCTGAGAA AAAGATAATG GCTGATACGC TTACTACACC 11880

GACGACAAAT TTAGTACTCA TTGCAGCTTT AGTTACTAAC AAAGATGGTA GAAACATCTC 11940

TACAATAGAA AckCTGACGC TTTTGCTAGT AAAGCCTGAT CAGCAATTGG GAAAATATAA 12000

ATAAATGGAT AGAAGATATA GCCAAGCCAA TCAATGAATG GTGTATAGTT CGCTACAA7C 12060

AGTCCTAAAA AACCAATCGA TAATATAGAA GGTAAAATAC CAACAGTCAT TTCTAAACCG 12120

TCTTTCAAAT TGTCCCAAAC GTTCTTCACG AGAGATGGTG TTAATGCATT TTGTTTCATC 12180

GCCTCTGCAT ATGCAGTTTT CAGTCTGCTT CCTTCAATAG CAACTTCTTG TTCTCCTTCT 12240

TGTCCGTTAT AATATTCTGT TGATTCATTG CTGATTGGCG GTAGCCATGC AGTAATTGCA 12300

GTCACGACAA ATGTGATGAC TAAAGTTATC CAAAAGTATA AATTCCAATG CGGCATTAAT 12360

CCTAAAGTTT TAGCAACGAT AATCATAAAA GTTGCTGAAA CTGTTGAAAA GCCAGTCGCA 12420

ATAATCGTGG CTTCTCGTTT GTTGTACATC CCTTGCTTAT AGACACGATT AGTAATCAAT 12480

AATCCTAAGG AATAACTGCC GACAAACGAA GCCACTGCAT CGACAGCGGA TTTTCCTGGT 12540

GTTTTAAAAA TAGGTCTCAT AATAGGCTCC ATATAAACAC CGACAAATTC TAATAAGCCA 12600

TAGCCCACTA ATAAAGAAAG CGcAATTGCA CCTACTGGAA TTAAGATACT TAATGGCATC 12660

ATTAATTTTT CAAACAAAAA CGGACCATAG TTAGCTTTAA ATAGTATTGA TGGACCGATT 12720

TTAAATACAT ACATTATACC GATCATTGCA CCTGCAACTT TAAATAATGT AATGACCAAG 12780

TTTGTGATTG AAGTCATAAA AGTACGTCTC ACTATTGGTA ACGCTGTACC AATTAAAATC 12840

ATAATCAGTG CAACATAGGG CATAAGTGGA CCTATGATTG AGCGAATGGC TAGATGAACA 12900

TGATCGACGA AAATAGTGTT GTTACCATTA ATCGTAAAAG GAATAAAGAA ACATAGTATG 12960

CCCACTAAAC TATAGACAAA AAAACGCCAT GCACTTGGTT GTTGTGCATT AGAATGATAT 13020

TGATTCATTA AAGCAACCCC TTTGTTTAAA TGAATACACA AAACTGTATG ATGCATCTTC 13080

ATAGTTTGAA TTATTTTCAT ACCAATACAA ATTAACTAAT TATATATAGA TTGAAACTAT 13200

ATTACTTAAT AAAATATTTA TCTTAAATGT TGTTGTGTTG ATTCAACACC ACAACTAAAA 13 26 0 

GTGTTTATAA ATTATTTGGA AATACACATA TTTGTAAATG ATTAGTATCG ATTTAATATC 13 3 20 

GTATTATTAA ATTTTTATTA ATTTTGTAGT CTTAATCmAA AAATAATATA TGTCATGTTA 13380 

TATTGAAGGT GCAGTTGTTT TTCATTCTCA AGAGGGGGTC AAAAAAATAC TTTTGAGGTG 13440 

ATTATATGTT AAGAGGACAA GAAGAAAGAA AGTATAGTAT TAGAAAGTAT TCAATAGGCG 13 500 

TGGTGTCAGT GTTAGCGGCT ACAATGTTTG TTGTGTCATC ACATGAAGCA CAAGCCTCGG 13 560 

AAAAAACATC AACTAATGCA GCGGCACAAA AAGAAACACT AAATCAACCG GGAGAACAAG 13620

GGAATGCGAT AACGTCACAT CAAATGCAGT CAGGAAAGCA ATTAGACGAT ATGCATAAAG 13680 

AGAATGGTAA AAGTGGAACA GTGACAGAAG GTAAAGATAC GCTTCAATCA TCGAAGCATC 13 740 

AATCAACACA AAATAGTAAA ACAATCAGAA CGCAAAATGA TAATCAAGTA AAGCAAGATT 13800

CTGAACGACA AGGTTCTAAA CAGTCACACC AAAATAATGC GACTAATAAT ACTGAACGTC 13 860 

AAAATGATCA GGTTCAAAAT ACCCATCATG CTGAACGTAA TGGATCACAA TCGACAACGT 13 920 

CACAATCGAA TGATGTTGAT AAATCACAAC CATCCATTCC GGCACAAAAG GTAATACCCA 13980

ATCATGATAA AG C AG CAC CA ACTTCAACTA CACCCCCGTC TAATGATAAA ACTGCACCTA 14 04 0 

AATCAACAAA AGCACAAGAT GCAACCACGG ACAAACATCC AAATCAACAA GATACACATC 1410 0 

AACCTGCGCA TCAAATCATA GATGCAAAGC AAGATGATAC TGTTCGCCAA AGTGAACAGA 14160

AACCACAAGT TGGCGATTTA AGTAAACATA TCGATGGTCA AAATTCCCCA GAGAAACCGA 14 2 20 

CAGATAAAAA TACTGATaAT AAACAACTAA TCAAAGATGC GCTTCAAGCG CCTAAAACAC 142 80 

GTTCGACTAC AAATGCAGCA GCAGATGCTA AAAAGGTTCG ACCACTTAAA GCGAATCAAG 14340 

TACAACCACT TAACAAATAT CCAGTTGTTT TTGTACATGG ATTTTTAGGA TTAGTAGGCG 14 4 00 

ATAATGCACC TGCTTTATAT CCAAATTATT GGGGTGGAAA TAAATTTAAA GTTATCGAAG 14460 

AATTGAGAAA GCAAGGCTAT AATGTACATC AAGCAAGTGT AAGTGCATTT GGTAGTAACT 14 520 

ATGATCG CGC TGTAGAACTT TATTATTACA TTAAAGGTGG TCGCGTAGAT TATGGCGCAG 14 580 

CACATGCAGC TAAATACGGA CATGAGCGCT ATGGTAAGAC TTATAAAGGA ATCATGCCTA 14640

ATTGGGAACC TGGTAAAAAG GTACATCTTG TAGGGCATAG TATGGGTGGT CAAACAATTC 14 700 

GTTTAATGGA AGAGTTTTTA AGAAATGGTA ACAAAGAAGA AATTGCCTAT CATAAAGCGC 14 760 

ATGGTGGAGA AATATCACCA TTATTCACTG GTGGTCATAA CAATATGGTT GCATCAATCA 14820

CAACATTAGC AACACCACAT AATGGTTCAC AAGCAGCTGA TAAGTTTGGA AATACAGAAG 14880

ATT7AGGATT AACGCAATGG GCCTTTAAAC AATTACCAAA TGAGAGTTAC ATTGACTATA 15000

TAAAACGCGT TAGTAAAAGC AAAATTTGGA CAT CAG A CG A CAATGCTGCC TATGATTTAA 1506 0 

CGTTAGATGG CTCTGCAAAA TTGAACAACA TGACAAGTAT GAATCCTAAT ATTACGTATA 1512 0 

CGACTTA7AC AGGTGTATCA TCTCATACTG GTCCATTAGG TTATGAAAAT CCTGATTTAG 1518 0 

GTACATTTTT CTTAATGGCT ACAACGAGTA GAATTATTGG TCATGATGCA AGAGAAGAAT 15240 

GGCGTAAAAA TGATGGTGTC GTACCAGTGA TTTCGTCATT ACATCCGTCC AATCAACCAT 153 00 

TTGTTAATGT TACGAATGAT GAACCTGCCA CACGCAGAGG TATCTGGCAA GTTAAACCAA 153 60 

TCATACAAGG ATGGGATCAT GTCGATTTTA TCGGTGTGGA CTTCCTGGAT TTCAAACGTA 15420 

AAGGTGCAGA ACTTGCCAAC TTCTATACAG GTATTATAAA TGACTTGTTG CGTGTTGAAG 15480 

CGACTGAAAG TAAAGGAACA CAATTGAAAG CAAGTTAAAT TCATCTTCTG AATTTAATAT 15540 

GCTATGTAAA TCGTGCTGTT ATCATGGCAC ATCAGATATA AG TAG CAT CA CAGTGTTGAA 15600 

TTTAAAAATA GTAAAGTGAA ATAAAGCGCC TGTCTCATTA GCGAAAACTA AAGGGACAGG 15660 

CGTATCTGTT TATGAGCTTA ATAAATTGTA TGAATAATAT GGTTGATCGA ATAACTGTTT 15720 

ATCATGATGA TAAATTGAGT TTTTTAAAAT AATGATATAT TACATCATTG TTATAGCGTT 15780 

TAAGAAATCA ACAACTTTAC GATAAATAGT GATTGCTTCG TCATTAGGTC TACGATCAAA 15 840 

ATCATGCTCG TTTTTATTCA CGCGTTCAAA TGTTGAATGT GGAACATGAT TCATGATATG 15 900 

TTCGCTTTCC TCAACGGGAA CATCATAATC GCCATTACAA TGCGCAATGA AAACAGGTGG 15 960 

AAGTGTTTTA AGTTCATCTG GTGCAATATT ATATTTTGAA TTAGTATAAT CAGCAATGTT 16 020 

AATCATATTT ATCCATTTAC CTGTGCCACG TGCATAAACG TAGATTAAAA AACGTTGTGC 16 080 

GATTTGATCT TGAACAACCG GTGTTGGTGA AGTGAGTTGT GCAATCATTG TTTCGTTTAC 1614 0 

GCTTTGAGCT ATTTTTGCGT AATAACTATT AGTTGTTTTA AAAGGTTCAG TGTTGATGCG 16200 

ACTATAACCA TAAAAATCAA TAACACCATC AATATCTCTG TCTCGTGCAA TTAATAGACT 16260 

TAAATATGCA CCTGATGATC TGCCAAAGGT AAAAATAGGG CAATTAGAAT ATTGTGATTG 16 320 

AATCGCATCG AATGAtGCgn AGnACATCCT CAATAATGCA ATCGAGACTT ACTTCTGGTA 16 380 

ATAAACGATA ACTTAGTTGA ATTAAATCGT AATGTTCCGT AAgATATCGA TATACTGTGG 16440 

GGATAAATCG TTAGCTTTAC CGAACATTAA TCCACCACCG TGGATGTAGA CAATAGCGCC 16 500 

TTTTGTTGGT TGATTTTTTG CTTTAATAAT TGTGTAAGGT AATGCAAATG CATCTTTAGT 16 560 

AATTACTTTA TCTTTAATTT CAGTCACGAT TTAATAGGCT CCTTATTTTT GATATTGATG 16 62 0 

TCATTATAAC ACTGTCTTAA ATTTCCATGA AAAATAGTCT TAAGACGATG AGTCATGATA 16680

CATCATTTTA ACAATATCTT TAAAAGCAGC ATGTGGAATG GCTAAATCTT CTAAATCTGC 16800

CATAGAAAAT TCAAGATTGA TATCATGTGG TCGCTGTTCA GCAAGTTTAT GCACAAAGTC 16 860 

AGGTTCTG7G ACAAAAGGCG AAGACATGCC GACCATATCT GCATGTTGTA AAGCATCTAA 16 92 0 

AGCAGACTCT GGAGAATTAA TCCCGCCACT TGCAATTAAA GGGATACGAC CTGCTAAATG 16 980 

TTCATAGACA ATTTGGTTAA CTGGTCGACC GAAATGATCA CCTGGTGTAC GAGACGTATT 17040 

TTGATAAATA TGTCGACCCC AGCTAGCGAT TGCTAAGTAT TGGATGTTTG AAACGTCCAT 17100 

GACCCAATTG ATTAATTGGT TGAACTCGTC AATGGTATAT CCTAAATCAC TGCCTCTGGT 1716 0 

TTCTTCTGGC GTTGCTCGAA ATCCTAAAAT AAAATTGTCA GGTGCTTCTT TATCAATCAC 17220

TTCTTGTACC GCACGCATAA CTTCTAAACA TAATCTTGCA CGATTTTTTA ATGAGTCGGC 17280 

ACCGTAATGG TCTGTACGTT TATTCGAAAA AGTTGAGAAA AATGTTTGAA TCAGCAAACG 1734 0 

TTGTGCAATC GAAATTTCCA CACCATCAAA ACCTGCTTTA ATCGCGCGTA ATGTAGCATC 17400

GCGATACTGC TGAATGATGC TATTGATTTT CTCATGAGAC ATGGCGATAA CATCGTGTTC 17460 

AATCGGTGAA TGCAATGTCA TAGGGCTTGG TCCATACACC TTTCCAAAAT TTAAAATGGC 17 520 

TTGATTTGAA AAACGACCAG CATGCGCTAg CTGGATAATA GCGAGGCTAC CATGTTGTTT 17580

CATCGTAGAT GCCATGTTAG TTAATCCAGG GATACAAGCA TCATGATCAA TATTAAAGCC 17 64 0 

ATATTCAAAC AATTGACCAT AAGGTTCAAT GTAAGCAGCG CCGGTGACTT GCATTCCAGC 17700 

TGAATTAGAG CGACGTG CAG CATAAGCCAA GTCTTCTTTT GTAATATAGC CTTCTTTTGT 17 760 

TGATGTGTTT ACGGTCATTG GTGATAATAC AAAGCGATTC GAAATTTTGA TGCCATTAGG 17 82 0 

TAAGTGGATT GATTGTAAAA GTGGTTTGTA TCGGTACATA CTATGATTCC TTTTCTATTC 1788 0 

AATATTGTTT TCAAAGTACC ATGGAAAGAA TGAATAATCA ATGATGAACA GTCTTGATAG 17 940 

AATAGAATTG GTACATGGAA AGTATTTTTA AAATTAAACT AATGAATGGC ATTTGTAGGT 18000 

CTGAAAATAT GAATATGAAA AAGAAAAATA AAGGCGAAAA GATATAAAAG TTAATTGAAA 18 060 

AACGTTATCA TAT A CGTGGG TATATGAAGA GGGAATGGTA TTAAGAACGC TAAAATGTTA 1812 0 

TGTCGGTTTG ACATGACAGG ATAAGTTTGG AGATGACGGA TTGGTTAAAT TAAGCGTATT 1818 0 

AGACTATGCC TTAATAGATG AAGGTAAGGA TGCACAAAAG GCATTGCAAG ATTCAGTGAC 18240

ACTTG CAAAA TTAGCAGATC GACTTGGCTT TAAGCGAATT TGGTTTACGG AACATCATAA 18300 

TGTACCAGCG TTTGCGTGTA GTAGTCCAGA ACTTTTGATG ATGCATACAT TGGCGCAGAC 18360 

AAATCACATA CGAGTTGGCT CTGGTGGTGT GATGCTGCCG CACTATCGAC CTTATAAAAT 18420

TGCTGAGCAT TTTAGAATGA TGGCAGCGTT ATATCCAAAT CGTATTGATT TAGGTATTGG 18480

TAGTTACGAT GAATCGATTT CGTTATTACG TGATTATCTT ACAATAAAGG ATAAACCAAG 18600

TGCGCATACG TTAGGTGTCC AACCACACAT TGATCATTTT CCAGAAATGT GGTTATTAAG 18660 

TAGTAGCGCA ACATCTGCCA AAATAGCTGC CGAACTAGGT ATAGGGCTTT CTGTTGGAAC 1872 0 

ATTTTTGCTA CCAGATATAA ATGCGATACA TACAGCGAAG GATAACATTG ATATTTACAA 18780 

AAAACATTTC CAAGCATCAA CGATTAAAAT GGACGCAAAG GTGATGGCAT CTGTATTTGT 18840

CATTGTAGCT GATAACGAAG CGGAAGTAGC AGCATTACAA CATGCCTTAG ATGTTTGGTT 18900 

ATTAGGTAAA TTACAATTTG CAGAATTTGA AGATTTTCCT TCAGTAGACA CAGCACAAAA 18960 

GTATAAGCTT AATGATCGAG ACAAAGAGAT GATTCAAGCA CATCAAGCAC GCATCATTGC 19020 

AGGTACACAA GAAAAGGTTA AAGCACAATT AGATGATTTC ATTGCTACGT TTGAAGTTGA 19080 

TGAGGTGTTA GTAGCACCGC TTATTCCAGG TATTGAACAG CGTTGTAAAA CATTAAAATT 19140 

ACTCGCGGAA ATTTATTTGT AGCATTTTAA ATAGAAGAGA AAGGATGAAG ATAAGATGAA 19200 

AAAGTTAGCC AATTATTTAT GGGTAGAAAA AGTAGGAGAT TTGTATGTGT TTAGTATGAC 19260 

ACCTGAATTG CAAGATGATA TTGGGACAGT AGGTTATG TT GAATTCGTAA GTCCAGATGA 19320 

AGTTAAAGTG GATGATGAAA TTGTGAGTAT CGAAGCATCG AAAACGGTCA TTGATGTGCA 19380 

AACGCCATTG TCAGGAACGA TTATTGAGCG AAATACAAAA GCGGAAGAAG AACCGACAAT 19440 

TTTAAACTCT GAAAAACCAG AAGAAAATTG GTTGTTCAAA TTGGATGATG TCGATAAAGA 19500 

AGCATTCCTA GCATTACCGG AGGCTTAAAT GGAAACGTTA AAATCAAATA AAGCGAGACT 1956 0 

TGAATATTTA ATCAATGATA TGCATCGAGA GAGAAATGAC AATGACGTAT TGGTAATGCC 1962 0 

ATCTTCATTT GAAGATTTGT GGGAATTATA TCGAGGCTTA GCAAATGTCA GACCGGCATT 19680 

ACCTGTAAGT GATGAATATT TAGCTGTACA AGATGCTATG TTAAGTGATT TGAATCGTCA 1974 0 

ACATGTTACG GATTTGAAGG ATTTGAAGCC GATAAAAGGT GACAATATCT TTGTTTGGCA 19800 

AGGTGATATC ACGACGTTAA AAATCGATGC TATTGTTAAT GCTGCAAATA GTCGTTTTCT 19860 

AGGATGTATG CAAGCTAATC ATGACTGCAT TGATAATATT ATTCATACAA AAGCGGGTGT 19920 

TCAAGTTCGA CTTGATTGTG CAGAGATCAT TCGACAACAA GGGCGCAATG AAGGTGTAGG 19980 

TAAAGCCAAA ATAACACGTG GATATAATTT GCCAGCAAAG TATATAATTC ATACGGTTGG 20040 

TCCGCAAATA CGTCGATTGC CTGTTTCAAA GATGAATCAG GACTTGTTAG CTAAATGTTA 2 0100 

TCTTAGCTGT CTTAAATTGG CTGATCAACA TAGTTTAAAT CATGTCGCTT TTTGCTGTAT 2 016 0 

ATCTACAGGT GTATTTGCTT TTCCTCAAGA TGAAGCAGCA GAAATTGCTG TTCGAACAGT 2 0220 

AGAAAGCTAT CTCAAAGAAA CAAATTCAAC ATTGAAAGTC GTGTTCAATG TATTTACAGA 2 0280 
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CAATGTCTCT GTTAATGGAT GACAAGACAA AGCAGGCTGA AGTATTGCGT ACTGCGATTG 2 0400 

ATGAAGCAGA TGCGATAGTG ATTGGAATTG GTGCAGGCAT GTCTGCATCT GACGGATTTA 2 0460 

CA7ATGTAGG AGAGCGTTTT ACGGAAAATT TCCCAGATTT TATTGAAAAA TATCGCTTCT 2 0 520 

TTGATATGTT GCAAGCGAGT TTACATCCTT ATGGCAGTTG GCAAGAGTAT TGGGCATTTG 20580

AGAGTCGTTT TATTACATTA AACTATTTAG ATCAACCTGT AGGTCAGTCT TACCTCGCTT 2 0640 

TAAAATCCTT GGTGGAAGGT AAACAGTACC ACATTATAAC TACGAATGCA GATAATGCTT 20700 

TCGATGTAGC TGATTATGAT ATGACTCATG TATTTCATAT ACAAGGGGAG TATATACTGC 2 0760 

AACAGTGTAG CTCAGCATTG TCATGCTCAA ACGTATCGCA ATGATGATTT AATT CGTAAA 2 0 820 

ATGGTTGTTG CGCAACAAGA TATGCTTATA CCTTGGGAGA TGATTCCAAG ATGTCCAAAA 2 08 80 

TGTGATGCCC CAATGGAAGT GAATAAACGT AAAGCGGAAG TTGGGATGGT TGAAGATGCT 2 0 940 

GAATTTCATG CGCAACTACA TCGTTATAAT GCTTTTCTAG AGCAACATCA AGATGATAAA 21000 

GTGTTGTATT TGGAAATTGG AATTGG TT AT ACTACACCAC AATTTGTGAA GCATCCTTTT 21060 

CAGCGTATGA CACGTAAAAA TGAAAATGCC C TTTATATG A CGATGAATAA AAAGGCATAT 2112 0 

CGCATTCCGA ATTCAATTCA AGAACGTACC A TA CATTTAA CTGAGGATAT CTCAACATTG 21180 

ATTACAGCAG CACTCCGGAA CGACAGCACA ACGAAAAATA ACAACATTGG AGAGACAGAA 21240 

GATGTACTTA AT AG AA C C G A TTAGAAATGG AGAATATATT ACTGATGGTG CGATTGCACT 21300 

CGCTATGCAA GTTTATGTTA ACCAGCATAT CTTTTTAGAT GAAGATATTT TATTCCCTTA 213 60 

TTATTGTGAT CCAAAAGTGG AAATTGGACG TTTTCAAAAT ACTGCTATAG AAGTGAATCA 214 2 0 

AGATTATATA GATAAACACA GTATTCAAGT AGTTCGCCGA GATACTGGTG GTGGCGCTGT 214 80 

GTATGTTGAT AAAGGTGCCG TTAATATGTG TTGTATTTTA GAACAAGACA CTTCAATTTA 21540 

TGGT5ATTTT CAACGATTTT ATCAACCAGC TATAAAGGCG TTGCATACAT TAGGTGCAAC 216 00 

AGATGTGGTA CAAAGCGGTA GAAATGATTT AACATTGAAT GGTAAAAAAG TGTCAGGCGC 216 6 0 

CGCAATGACA TTAATGAATA ATCGTATTTA TGGCGGTTAT TCGCTATTAC TTGATGTTAA 21720 

TTATGAAGCA ATGGATAAAG TGTTAAAGCC TAATCGCAAA AAGATTGCAT CGAAAGGGAT 21780 

TAAATCTGTG CGCGCACGTG TTGGTCATCT TAGAGAAGCA CTGGATGAAA AGTATCGTGA 21840 

TATAACCATT GAAGAATTTA AAAATTTAAT GGTGACGCAG ATTTTGGGAA TCGATGACAT 21900 

TAAAGAGGCG AAACGATATG AATTAACGGA TGCAGATTGG GAAGCGATTG ATGAATTAGC 21960 

TGATAAAAAG TATAAAAATT GGGATTGGAA TTATGGCAAG TCACCCAAAT ATGAATACAA 22 020 

TCGAAGTGAA AGATTATCTT CAGGTACGGT AGACATAACA ATTTCTGTTG AACAAAATCG 22 080 
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AGAAGCATTA CAAGGAACAA AAATGACAAG AGAAGATTTA ACGCATCAGT TAAAGCAATT 2 2200 

AGACATCGTT TATTATTTTG GCAATGTTAC GGTAGAAGCA TTAGTGGATA TGATTTTAAG 222 6 0 

TTAATATTGT TATTTTATGT ATGCTGAATC ATTGGAAGTG TTTGCTTGCT CTTGAAAAGG 22320

TGACAATAGT GTTTGGTGAA GGTTGAACAT ATGAGTGGAA ATTATTGCCT TTAACTATTC 223 8 0 

AAAGTATGAT ATATATATGG TTTTTGTTTC TAAATGATTG GGTATTTGAA AATAGATGAG 22440 

TTTAATATTT 7AAGGAATAT AATGATGTTT ACTTTTATAA TTCATATAGA ATATTAAGCA 2250 0 

ATATAAGTCT GTTGATATAT ACAAAATATA ATGACTGCTA TAATGAGTAA TCAATAGACA 2256 0 

CAAAGAGGAG ATTATGTGAT GAATAATAAA GTATTAGTAA CCGGTGGTAC AGGGTTTGTT 22620 

GGCATGCGAA TTATTTCACG ATTATTAGAA CAAGGTTATG ACGTACAAAC GACGATACGT 226 8 0 

GATTTAAGTA AAGCTGATAA AGTAATTAAA ACAATGCAAG ACAATGGCAT TTCCACAGAG 22740

CGATTAATGT TTGTCGAAGC GGATTTATCA CAAGATGAAC ATTGGGATGA AGCAATGAAA 22800

GATTGCAAGT ATGTCTTGAG TGTAGCATCT CCGGTGTTTT TCGGTAAAAC AGACGATGCA 22 860 

GAAGTGATGG CGAaCTGcAA TTGAAGGTAT ACAACGTATT TTAAGAGCTG CAGAACATGC 22920 

GGGTGTTAAA CGTGTGGTAA TGACTGCAAA CTTTGGTGCA GTTGGTTTTA GTAATAAAGA 22980

TAAAAATTCA ATCACAAATG AAAGTCATTG GACAAATGAA GATGAACCAG GCTTATCAGT 2 3 040 

ATATGAAAAA TCAAAATTGT TAGCTGAAAA GGCAGCGTGG GATTTTGTTG AGAATGAAAA 2 3100 

TACAACAGTA GAATTTGCCA CAATCAATCC AGTTGCAATT TTTGGGCCAT CATTAGATGC 23160

ACACGTTTCA GGAAGCTTTC ATTTATTAGA AAATTTATTG AATGGTTCAA TGAAACGTGT 2 3 220 

ACCGCAAATT CCGTTAAATG TTGTTGATGT GAGAGACGTA GCTGAACTGC ACATTTTGGC 23280

AATGACAAAT GAACAAGCTA ATGGCAAGCG ATTTATTGCG ACGGCTGATG GACmAATTwA 2 3340 

tTTGTTGGGA ATTGcCAAAt TAATTAAAGA AAAGGGCCTG GAAATAGCTC CAAAAGTTCC 234 00 

TACTAAAAAA TTACCCAGCT TTATTTTGAG CnAnGnGCC 23439

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4522 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

CCCTTTGAGA GTATATCATC TAGTCAAATT ATGCCTGTCA TTAGAGCGAC TAGCTTTGAT 60 

TATTATGCAS TCGATTTAGG GAAATCATAT CGTCTAATTG ACGAAAGCAT GTTAGAGGAT 180

7TGAAGTTAA CTGAACAACA AATAAGAGAA ATGTCTCTGT TTAATGTTAG AAAATTGTCA 240

AATTCATATA CGACTGATGA AGTAAAAGGT AATAlTViTT ATTTTATTAA CTCAAATGAC 300

GGGTATGATG CAAGTAGGAT ACTAAATACT GCATTTTTAA ATGAAATTGA GGCACAATG7 360

CAAGGCGAAA TGCTCGTAGC AGTCCCACAC CAAGATG7GT TAATTATTGC AGATATACGC 420

AATAAAACAG GATATGATGT GATGGCACAT TTAACAATGG AATTTTTCAC TAAAGGTCTA 480

GTTCCAATTA CATCATTATC CTTTGGATAT AAACAGGGTC ATCTTGAACC GATATTTATT 540

TTAGGTAAAA ATAATAAACA AAAAAGAGAT CCAAACGTGA TTCAGCGTTT AGAAGCAAAT 600

CGTCGTAAAT TTAATAAAGA TAAATAGAAA TAATTGGATA AGGAGTTTTG TCATAATGAA 660

TTTATTTTAC AATCCTAAAT ATGTAGGAGA TGTCGCATTT TTACAAATTG AACCAGTTGA 720

AGGTGAATTA AACTACAATA AAAAAGGTAA TGTTGTTGAA ATTACtAATG AAGGTAATGT 780

TGTAGGTTAT AATATTTTTG AAATTTCAAA AGATATAACA ATTGAAGAAA AAGGTCATAT 840

TAAATTAACT GATGAACTTG TAAATGTATT CCAAAAGCGT ATTTCAGAAG CTGGTTTTGA 900

TTATAAATTA AATGCTGATC TATCACCGAA ATTTGTAGTT GGCTACGTTG AAACTAAAGA 960

CAAACATCCT GATGCAGATA AATTAAGTGT ACTAAATGTA AACGTTGGAA ATGACACATT 1020

ACAAATTGTA TGTGGCGCGC CTAACGTTGA AGCTGGACAG AAAGTTGTTG TTGCTAAAGT 1080

AGGTGCAGTG ATGCCTAGCG GTATGGTAAT TAAAGATGCT GAATTACGTG GTGTTGCCTC 1140

AAGCGGTATG ATTTGTTCAA TGAAAGAATT GAATTTACCT AATGCACCTG AAGAAAAAGG 1200

TATTATGGTA TTAAATGACA GCTATGAAAT TGGACAAGCA TTtTTTGAAT AATTAAGGAA 1260

GGTAGTGAAA ATATGAGCTG GTTTGATAAA TTATTCGGCG AAGATAATGA TTCAAATGAT 1320

GACTTGATTC ATAGAAAGAA AAAAAGACGT CAAGAATCAC AAAATATAGA TrACGATCAT 1380

GACTCATTAC TGCCTCAAAA TAATGATATT TATAGTCGTC CGAGGGGAAA ATTCCGTTTT 1440

CCTATGAGCG TAGCTTATGA AAATGAAAAT GTTGAACAAT CTGCAGATAC TATTTCAGAT 1500

GAAAAAGAAC AATACCATCG AGACTATCGC AAACAAAGCC ACGATTCTCG TTCACAAAAA 1560

CGACATCGCC GTAGAAGAAA TCAAACAACT GAAGAACAAA ATTATAGTGA ACAACGTGGG 1620

AATTCTAAAA TATCACAGCA AAGTATAAAA TATAAAGATC ATTCACATTA CCATACGAAT 1680

AAGCCAGGTA CATATGTTTC TGCAATTAAT GGTATTGAGA AGGAAACGCA CAAGCCAAAA 1740

ACACATAATA TGTATTCTAA TAATACAAAT CATCGTGCTA AAGATTCAAC TCCAGATTAT 1800

CACAAAGAAA GTTTCAAGAC TTCAGAGGTA CCGTCAGCTA TTTTTGGCAC AATGAAACCT 1860
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AAACAAAAAT ATGATAAATA TGTAGCTAAG ACGCAAACGT CTCAAAATAA ACAATTAGAA 198 0 

CAAGAAAAAC AAAATGATAG TGTTGTCAAA CAAGGAACTG CATCTAAATC ATCTGATGAA 2040

AATGTATCAT CAACAACAAA ATCAATGCCT AATTATTCAA AAGTTGATAA TACTATCAAA 2100

ATTGAAAATA TTTATGCTTC ACAAATTGTT GAAGAAATTA GACGTGAACG AGAACGTAAA 2160

GTGCTTCAAA AGCGTCGATT TAAAAAAGCG TTGCAACAAA AGCGTGAAGA ACATAAAAAC 2220

GAAGAGCAAG ATGCAATACA ACGTGCAATT GATGAAATGT ATGCTAAACA AGcGGAACgC 2280

TATGTTGGTG ATAGTTCATT AAATGATGAT AGTGACTTAA CAGATAATAG TACAGATGCT 234 0 

AGTCAGCTTC ATACAAATGG CATAGAGAAT GAAACTGTAT CAAATGATGA AAATAAACAA 24 00 

GCGTCAATAC AAAATGAAGA CACTAATGAC ACTCATGTAG ATGAAAGTCC ATACAATTAT 24 6 0 

GAGGAAGTTA GTTTGAaTCA AGTATCGACA ACAAAACAAT TGTCAGATGA TGAAGTTACG 2 520 

GTTTCGAATG TAACGTCTCA ACATCAATCA GCACTACAAC ATAACGTTGA AGTAAATGAT 2580 

AAAGATGAAC TAAAAAATCA ATCCAGATTA ATTGCTGATT CAGAAGAAGA TGGAGCAACG 264 0 

aATAAAGAAG AATATTCAGk AAGTCAAATC GATGATGCAG AATTTTATGA ATTAAATGAT 2700 

ACAGAAGTAG ATGAGGATAC TACTTCAAAT ATCGAAGATA ATACCAATAG AAACGCGTCT 2760

GAAATGCATG TAGACGCTCC TAAAACGCAA GAGTACGCAG TAACTGAATC TCAAGTAAAT 2820

AATATCGATA AAACGGTTGA TAATGAAATT GAATTAGCAC CGCGTCATAA AAAAGATGAC 2 8 80 

CAAACAAACT TAAGTGTCAA CTCATTGAAA ACGAATGATG TGAATGATAA TCATGTTGTG 2940

GAAGATTCAA GCATGAATGA AATAGAAAAG AATAACGCAG AAATTACAGA AAATGTGCAA 3000

AACGAAGCAG CTGAAAGTGA ACAAAATGTC GAAGAGAAAA CTATTGAAAA CGTAAATCCA 3 060 

AAGAAACAGA CTGAAAAGGT TTCAACTTTA AGTAAAAGAC CATTTAATGT TGTCATGACG 3120

CCATCTGATA AAAAGCGTAT GATGGATCGT AAAAAGCATT CAAAAGTCAA TGTGCCTGAA 3180 

TTAAAGCCTG TACAAAGTAA GCAAGCTGTG AGTGAAAGAA TGCCTGCGAG TCAAGCCACA 324 0 

CCATCATCAA GATCTGATTC ACAAGAGTCA AATACAAATG CATATAAAAC AAATAATATG 3300

ACATCAAACA ATGTTGaGAA CAATCAACTT ATTGGTCATG CAGAAACAGA AAATGATTAT 3 3 60 

CAAAATGCAC AACAATATTC AGAGCAGAAA CCTTCTGTTG aTTCAACTCA AACGGAAATA 3 4 20 

TTTGAAGAAA GTCAAGATGA TAATCAATTG GAAAATGAGC AAGTTGATCA ATCAACTTCG 34 8 0 

TCTTCAGTTT CAGAAGTAAG CGACATAACT GAAGAAAGCG AAGAAACAAC ACATCCAAAC 3 54 0 

AATACTAGTG GACAACAAGA TAATGATGAT CAACAAAAAG ATTTACAGTC ATCATTTTCA 3 6 00 

AATAAAAATG AAGATACAGC TAATGAAAAT AGACCTCGGA CGAACCAACA AGATGTTGCA 3660

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CCAAGTGTTT CATTACTAGA AGAACCACAA GTTATTGAGT CGGACGAGGA CTGGATTACA 3 78 0 

GATAAAAAGA AAGAACTGAA TGACGCATTA TTTTACTTTA ATGTACCTGC AGAAGTACAA 384 0 

GATGTAACTG AAGGTCCAAG TGTTACAAGA TTTGAATTAT CAGTTGAAAA AGGTGTTAAA 3900 

GTTTCAAGAA TTACGGCATT ACAAGATGAC ATTAAAATGG CATTGGCAGC GAAAGATATT 396 0 

CGTATAGAAG CGCCTATTCC AGGAACTAGT CGTGTTGGTA TTGAAGTTCC GAACCAAAAT 4 02 0 

CCAACGACAG TCAACTTACG TTCTATTATT GAATCTCCaA GTTTTAAAAA TGCTGAATCT 4080 

AAATTAACAG TTGCGATGGG GTATAGAATT AATAATGAAC CATTACTTAT GGATATTGCT 414 0 

AAAACGCCAC ACGCACTAAT TGCAGGTGCA ACTGGATCAG GGAAATCAGT TTGTATCAAT 4 200 

AGTATTTTGA TGTCTTTACT ATATAAAAAT CATCCTGAGG AATTAAGATT ATTACTTATC 42 6 0 

GATCCAAAAA TGGTTGAATT AGCTCCTTAT AATGGTTTGC CACATTTAGT TGCACCGGTA 4 32 0 

ATTACAGATG TCAAAGCAGC TACACAGAGT TTAAAATGGG CCGTAGAAGA AATGGAACGA 4380

CGTTATAAGT TATTTGCACA TTACCCATGT ACGTAnTATA ACAGCATTTA ACnAAAAAGC 44 4 0 

CCCATATGAT GAAAGAATGn CAAAAATTGT CATTGTAaTT GATGAGTTGG CTGATTTAAT 4 5 00 

GATGATGGTC CGCAAGAAGT TG 4522

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 751 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

TCAAGTTTAC GGATACGTAT ATATTTTGCA TGACATTTAG TGCAATAATA TTCATAATTT 60

GCCCGTTGTT GATAGCTTTC AATGCTGTTA CAAAATCTAG GCGCTCCAAC CTGTTGGCTC 120

AATCGTTTAA AATCTTGATC TTTATGTTGA TAACCTTTAC CAGCAATATG CAAGTGATAA 180

TGACACAATT CGTGCAGTAT AATTTTTACA ACAGCATCTT CTCCATAATG CTCATATTGT 240

TTTGGATTAA TTTCAATATC ATGGGACTTT AAAAGATAAC GTCCGCCTGT TGTACGTAAC 300

CTTTTATTAA AATATGCACA ATGTCGAAAC GTACGTCCAA ATTTTTCTTC CGAAAGATTC 360

TCAACCATTC GCTGAAGTTT GTCATTATTC ATGTGGATCA ATCATCGTTA ATGATACTTT 420

GTCTTTATTT TTGTCAATAC TGTAAATCCA AACGTCAACG ATATCACCAA CACTGACAAT 480

ATCCATTGGA TTTTTTACGA ACTTCTTAGA AAGTTTCGAA ACATGGACAA GTCCATCTTG 540
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TTTCATTCCT TCTTGTAAAT CTTCAATTGA TAGCACATCG GATTTAAGGA TTGGTGTTTC g 60 

AAACTCGTCC CTTGGATCTC GATTAGGTGC GTTCAAGGAT TTAATAATAT CCTCTAATGT 720

AGGTACACCG ACTTGTAATT CAATCGCCAG T 751

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1076 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

TCTCCAGCTT TAACTTGATC TGGCACTTTA ACAATTGTCT GATCCATACA TACGCGACCA 60

ATAACTTCGC AITGATGACC ATTTACATTT ACAAAGCTAC CTTGCATTAT GCGTAAATGG 120

CCATCTGCAT ATCCAATAgG TAACAATGCT ATTGTAGTTG GGTCAGTAGC TGTATAAGTT 180

GCACCATAAC TTACAGACTC ACCCGCTTGT AGCGTCTTTG TTTGAACTAC ATTAGCAATT 240

AATTGCACAC TTGGTTTAAG GTGTACTTTA ACTTTTTGCT GTACATACTC TGATGGATAA 300

TATCCATAAA GGGAAATTCC TGGTCTTATT GCATTACAGA ATTGGCAATC CATTAATAGA 360

GAGCCTGCTG AGTTCTGACA ATGTATATAT TCAGGTTTAA TTGCTTCATT GACCATATCT 420

TTAAAACGTT GATATTGTTC AGTTGTCATA TCTCCTGGTT CGTCAGCACA GGCAAAGTGT 480

GTAAACACGC CTTCAAATAC AAGTTGCTCA TATTGTTGAA TGATTTCAAT CACTTCTTGA 540

TACGTTTTAG TATCTTTAAT ACCTAAACGT CCCATTCCTG TATCTAATTT AATGTGCAAC 600

CATAACTTTT TCTCTTGCTC ACCAGAAATG TTTTTAATTG CTTCTTTCAA CCACTGTTTA 660

GACGGAACCG TTAAGGCAAC TCGGTGTTGT ATCGCTTTAT CAATATCTTT AGCTGGTAAC 720

ACACCTAAGA CTAAAATTTT AGCAGTAATC CCATGCATTC TAAGTTCTAT CGCTTCATCT 780

AACGTTGCTA CAGCAAAAAA TGTGGCGCCA TTTTCCATTA AATGACGTGC TACTTTAACA 840

CTACCTAGTC CATAGGCATT GGCTTTAACG ACAGCCATCA CTGTTTTATT TGGATGCAAT 900

GTACTGAATA CTTTGAAATT TGATGCAACA GCGTTTAAAT CTACATTCAT ATACGCAGAT 960

CTATAATATT TATCCGACAT ATTACTTCCT CCTGTAATTC CCACACGTTT TAAAACTAGA 1020

TCTTAATTAT CATTGTATAA CAAATTTAAA ATGCTGACTT TTCTAAAACA ACTTGG 1076

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2930 base pairs 
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(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

TGACCACAAT GCCCAATACA ACCATCCCAT GGTAAAGCCA AGAGATGAGT CAATAAAGCG 6 0 

TGTTGAATAA GAGCTGAATG AACCTGATAC TGGATAAAAT GTTGCCAACT CTCCAATTGA 120 

TGACATTAAG AAATATAGCA TGACACCAAT AACAAGATAA GCGAGTATAG CGCCTCCAGG 180 

ACCAGCTTGA GAAATGATAT TACCAGTAGC TACAAATAGA CCAGTCCCAA 7TGCACCACC 240

TATAGCAATC ATGGAAATGT GTCTTGAGTT AAGACTACGG TTCATTTTAT TATCTTCCAT 3 00 

ATTTAGTCTC CCATCTATTT AAATATACCC ATTATTGTAA GCTTTTTAAG TGTACTATTC 3 60 

AATAACTATT TAGTACTGTA AAGCGAAAAA ATTAAAATTT TCTGATTTTT TAATCATCTT 4 20 

GAGCATGTTT AATTGTAATT TTGATGGGGT TAAATTATAA TATGTATTAA ATTATAATTA 4 80 

TnATAAATTG TGGAGGGaTG ACTATGTCAC AACAAGACAA AAAGTTAACT GGTGTTTTTG 540 

GGCATCCAGT ATCAGACCGA GAAAATAGTA TGACAGCAGG GCCTAGGGGA CCTCTTTTAA 6 00 

TGCAAGATAT TTACTTTTTA GAGCAAATGT CTCAATTTGA TAGAGAAGTA ATACCAGAAC 660

GTCGAATGCA TGCCAAAGGT TCTGGTGCAT TTGGGACATT TACTGTAACT AAAGATATAA 72 0 

CAAAATATAC GAATGCTAAA AtATTCTCTG AAATAGGTAA GCAAACCGAA ATGTTTGCCC 780

GTTTCTCTAC TGTAGCAGGA GAACGTGGTG CTGCTGATGC GGAcGTGACA TTCGAGGATT 84 0 

TGCGTTAAAG TTCTACACTG AAGAAGGGAA CTGGGaTTTA GTAGGGAATA ACACACCaGT 90 0 

ATTCTTCTTT AGAGATCCAA AGTTATTTGT TAGTTTAAAT CGTGCGGTGA AACGAGATCC 96 0 

TAGAACAAAT ATGAGAGATG CACAAAATAA CTGGGATTTC TGGaCGGGTt TCCAGAAGCA 102 0 

TTGCACCAAG TAACGATCTT AATGTCAGAT AGAGGGATTC CTAAAGATTT ACGTCATATG 108 0 

CATGGGTTCG GTTCTCACAC ATACTCTATG TATAATGA7T CTGGTGAACG TGTTTGGGTT 114 0 

AAATTCCATT TTAGAACGCA ACAAGGTATT GAAAACTTAA CTGATGAAGA AGCTGCTGAA 12 0 0 

ATTATAGCTA CAGATCGTGA TTCATCTCAA CGCGATTTAT TCGAAGCCAT TGAAAAAGGT 1260

GATTATCCAA AATGGACAAT GTATATTCAA GTAATGACTG AGGAACAAGC TAAAAACCAT 132 0 

AAAGATAATC CATTTGATTT AACAAAAGTA TGGTATCACG ATGAGTATCC TCTAATTGAA 1380 

GTTGGAGAGT TTGAATTAAA TAGAAATCCA GATAATTACT TTATGGATGT TGAACAAGCT 144 0 

GCGTTTGCAC CAACTAATAT TATTCCAGGA TTAGATTTTT CTCCAGACAA AATGCTGCAA 1500 

GGGCGTTTAT TCTCATATGG CGATGCGCAA AGATATCGAT TAGGAGTTAA TCATTGGCAG 156 0 
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GGTCAAATGC GCGTAGTTGA CAATAACCAA GGTGGAGGAA CACATTATTA TCCAAATAAC 16 8 0 

CATGGTAAAT TTGATTCTCA ACCTGAATAT AAAAAGCCAC CATTCCCAAC TGATGGATAC 174 0 

GGCTATGAAT ATAATCAACG TCAAGATGAT GATAATTATT TTGAACAACC AGGTAAATTG 180 0 

TTTAGATTAC AATCAGAGGA CGCTAAAGAA AGAATTTTTA CAAATACAGC AAATGCAATG 1860

GAAGGCGTAA CGGATGATGT TAAACGACGT CATATTCG7C ATTGTTACAA AGCTGACCCA 192 0 

GAATATGGTA AAGGTGTTGC AAAAGCATTA GGTATTGATA TAAATTCTAT TGATCTTGAA 198 0 

ACTGAAAATG ATGAAACATA CGAAAACTTT GAAAAATAAA TTTGATATGT AGTTTCTATA 2 04 0 

TTGCGTAGTT GAGCAGTTTA TGATATCATA ATAAATCGTA AAGATTCCTA ACAAGAGAGG 2100 

GTGTTTAACG TGCGCGTAAA CGTAACATTA GCATGCACAG AATGTGGCGA TCGTAACTAT 216 0 

ATCACTACTA AAAATAAACG TAATAATCCT GAGCGTATTG AAATGAAAAA ATATTGCCCA 222 0 

AGATTAAACA AATATACGTT ACATCGTGAA ACTAAGTAAT TCTTATCATT CAAATACGAC 22 8 0 

GATTTGAAAA TAAAGCGGGC TTACCTATTA TATTGGGGAG CTCGCTTTTT TATGAAATTT 234 0 

TTGTGAAGAG TGATTAATGG ATTGAGTTTC ATCGGTAGAA CAATATATGA TTATATTAGT 2 4 00 

TGTTACTTTA TTAAAaTTTG AGAATATTTA TAGAAGGAAA TAGATTACTG ATTTTATAAA 24 60 

GTCACTTTGT TAGCGAATGC TTGAAAGAGT ATTTAATATA GTAGAATTTA AAATTTCAAA 2520 

GCGGAATTTA ATAAGTACGA AGTAGTTCTG GGTATGTTTT ATAAATGTTC GATAATACAC 2580

TTTAATCTTA AATATGATGG TTTAGAAAAT GATTTAACAA AGAAATGAaA CTTTACTGTT 264 0 

GAATTATGTG AGGATTGTGT TATTATATAA ATCGTAATAA TTACGATTTG ATAAAAAGTG 2700 

AGGTAACTAT ATATGGCTAA GAAATCTAAA ATAGCAAAA3 AGAGAAAAAG AGAAGAGTTA 2760

GTAAATAAAT ATTACGAATT ACGTAAAGAG TTAAAAGCAA AAGGTGATTA CGAAGCGTTA 2820

AGAAAATTAC CAAGAGATTC ATCACCTACA CGTTTAACTA GAAGATGTAA AGTAACTGGA 2 88 0 

AGACCTAGAG GTGTATTACG TAAATTTGAA ATGTCTCGTA TTGCGTTTAG 2930

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3606 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
CTTCTTGCCA TGGCTCTCTT TATTTAAAAA TGCTTCCAAC TTGTCCATTT GATTGTTTCT 6 0 
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TTATAAAAAA CTAATTTTAC AAATGCTTTT GCGTTCTTAC AAAAAATGCA TTTGACTATT 190 

ATTATAATAA GCGTATAATT GTCGCATATT ATTTTTTGTA TTTTTGGCAA TAACGAAGGA 24 0 

GTATTTATGA ATAAAGACAA GCAATTGCAC AACGACAAAA TCAATCTATC CCAATTAGTC 300

TTATTAGGGT TAGGCTCTTT AATAGGATCT GGTTGGCTAT TTGGTGCGTG GGAAGCATCA 3 60 

TCAATAGCTG GACCAGCAGC AATCATATCA TGGGTTCTTG GATTCCTAGT CATTGGAACC 420

ATTGCCTATA ACTACATTGA AATCGGCACA ATGTTTCCTC AATCAGGTGG CATGAGTAAC 480

TATGCCCAGT ATACACATGG CTCATTATTA GGCTTTATTG CTGCTTGGGC GAATTGGGTG 54 0 

TCTTTGGTGA CAATAATACC TATCGAAGCT GTGTCAGCTG TTCAATATAT GAGTTCTTGG 6 00 

CCGTGGCATT GGGCGAAACC AATGAGATAT TTAATGGAAA ATGGCTCTAT TAGCACATAC 660

GGATTGCTAG CTGTATATCT CATCATTGTT ATTTTTTCAT TATTAAACTA TTGGTCCGTA 72 0 

AAACTTTTAA CATCATTTAC GAGTTTAATT TCTGTATTTA AATTAGGCGT ACCCATGTTA 780 

ACCATCATCA TGTTGATGCT ATCAGGATTC GACACTTCAA ATTACGGCCA TTCGGCAAGC 84 0 

ACATTTATGC CTTACGGAAG TGCACCGATT TTTGCTGCAA CAACAGCATC AGGGATTATT 90 0 

TTTTCATTCA ATTCATTCCA GACAATTATT AATATGGGTT CAGAAATTAA AAATCCTGAA 960

AAAAATATCG CAAGAGGCAT CGCTATCTCA CTGTCAATCA GTGCAGTGTT GTACATCATT 102 0 

TTACAAAGTA CGTTTATCAC TTCTATGCCT CAATCAATGT TACAACATAG TGGATGGAAT 1080 

GGCATCAACT TCAATTCACC ATTTGCTGAT TTAGCTATCT TATTAGGAAT TAATTGGCTC 114 0 

GCAATTTTAC TATACATTGA AGCTTTTGTA TCACCATTCG GTACTGGCGT GTCATTTGTC 1200

GCCGTTACAG GTCGAGTTTT ACGAGCAATG GAGAAAAATG GACATATCCC TAAATTTCTT 1260

GGGAAGATGA ATGAAAAATA TCATATCCCA CGTGTAGCAA TCATCTTTAA TGCCATCATT 1320

AGTATGATTA TGGTTACATT ATTTAGAGAT TGGGGTACGC TAGCAGCAGT TATTTCTACT 13 8 0 

GCAACTTTAG TAGCCTAT7T AACTGGCCCA ACGACAGTGA TTGCATTAAG AAAAATGGGA 144 0 

CCAACAATGA CTCGTCCATT TAGAGCAAAA ATTTTAAAAG TAATGGCACC ATTATCATTT 1500

GTATTAGCTT CATTAGCTAT ATATTGGGCA ATGTGGCCAA CAACGGCTGA AGTTATTTTA 156 0 

ATCATTATAC TTGGATTACC AATCTACTTC TTCTATGAAT ATCGTATGAA TTGGCGTAAT 1620

ACAAAGAAAC AAATTGGTGG TAGCTTATGG ATTATTGTAT ATTTAATCGT GCTATCAATA 1680

CTGTCATTTA TAGGAAGCAA AGAATTTAAA GGCTTAAATA TGATTCACTA TCCATTTGAC 1740

TTTATCGTTA TTATTATTGT GGCACTTATC TTCTATTACA TCGGTACAAC GAGTTCATTT 1800

GAAAGCGTCT ATTTCCGTCG CGCAACACGA ATCAATACGA AGATGCGTGA GTCACTAAAT 186 0 

CACACACATT AACCAACCAT TGATTTCAAC ATCTTGGTTG GTTTTTTATT TTGAAAATCG 1980

GTTATAAATA ACTAACATAA CAAGATGATG ATCAGGCTGG GACATAAATC AATGTTCTAT 2040

GCTCTACGAA gTTATATTGG CAGTAGTTGA CTGAACGAAA ATGCGCTTGT AACAAGCTTT 2100

TTTCGATTCT AGTCAGGGGC CCCAACACAG AGAATTTCGA AAAGAAATTC TACAGGCAAT 2160

GCAAGTTGGG GTGGGACGAC GATAAAGAAA TACTTTTTCT ATAGAAATTA GTATytCTTA 2220

TGCATGAGTT TTACTCATGT ATTCATATIT TTAAGTACAC ATTAGCTGTG GCTAATGTAT 2280

AAGAACCACT ACATAATAAA TCATTTGTGG CTCTTTATCA TTTCTGTCCC ACTCCCGTAG 2340

AAGTACATCA TATAATGCTG AAAATGGTTT GAGTTAAAAC AGATATCAAG CTCGTCTGAT 2400

TCAGTCACAA AATTGTCTTG TTATACTTGT CACCTATCAT CTATAGACCG TGGTATGATT 2460

AAATTGGGGA TGATAAAGGA GGTTAATAAA TATGAAGATT AATACTACAG GTGGTCAAAT 2520

TCATGGTATT ACACAAGATG GTTTAGATAT CTTCTTAGGC ATTCCTTATG CAGAACCACC 2580

AGTTCATGAC AATCGCTTTA AACATTCTAC GTTAAAAACA CAATGGTCAG AGCCAATTGA 2640

TGCAACTGAA ATACAACCCA TCCCACCGCA ACCAGACAAC AAATTAGAAG ATTTTTTCTC 2700

CTCACAATCT ACAACTTTTA CTGAACATGA AGACTGTTTA TATCTAAATA TTTGGAAACA 2760

ACATAATGAT CAGACGAAGA AACCTGTCAT CATTTATTTT TATGGTGGTA GTTTTGAAAA 2820

TGGTCATGGT ACAGCCGAAC TCTATCAACC GGCACATTTA GTACAAAATA ACGACATTAT 2880

CGTTATTACA TGCAATTATC GTTTAGGCGC ATTAGGATAT TTAGACTGGT CATATTTTAA 2940

TAAAGATTTT CATTCCAATA ATGGCCTTTC AGATCAAATC AATGTCATAA AATGGGTGCA 3000

TCAATTTATT GAATCCTTCG GTGGCGACGC TAATAACATT ACTTTAATGG GTCAGTCTGC 3060

AGGCAGTATG AGCATTTTGA CTTTACTTAA AATACCTGAC ATTGAGCCAT ACTTCCATAA 3120

AGTQGTTCTA CTAAGTGGCG CACTACGATT AGACACCCTT GAGAGTGCAC GCAATAAAGC 3180

ACAACATTTC CAAAAAATGA TGCTCGATTA TTTAGATACA GATGATGTTA CATCATTATC 3240

GACAAATGAT ATTCTTATGC TGATGGCGAA gcTAAAACAA TCTCGAGGAC CTTCTAAAGG 3300

GCTTGATTTA ATATATGCGC CTATTAAAAC AGATTATATA CAAAATAATT ATCCAACAAC 3360

GAAACCAATT TTTGCATGTT ATACAAAAGA TGAAGGCGAT ATTTATATTA CTAGTGAACA 3420

GAAAAAATTA TCGCCGCAAC GCTTTATCGA CATTATGGAA TTAAATGATA TTCCTTTAAA 3480

ATACGAAGAT GTTCAGACGG CGAAGcAACA ATCTTTAGCG ATTACACATT GTTATTTCaA 3540

ACAGCCGATG aAGCAATTTT TACmACmACT CAATATACmA GATTCCAACC GCACCAACTA 3600

TGGCTT
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15109 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 



GAAATTAAAA AAGCAATTGG nACAAGATGC AACAGTGTCA TTGTTTGATG AATTTGATAA 60

AAAATTATAC ACTTACGGCG ATAACTGGGG TCGTGGTGGA GAAGTATTAT ATCAAGCATT 120

TGGTTTGAAA ATGCAACsAG AACAACAAAA GTTAACTGCA AAAGCAGGTT GGGCTGAAGT 180

GAAACAAGAA GAAATTGAAA AATATGCTGG TGATTACATT GTGAGTACAA GTGAAGGTAA 240

ACCTACACCA GGATACGAAT CAACAAACAT GTGGaAGAAT TTGAAAGCTA CTAAAGAAGG 300

ACATATTGTT AAAGTTGATG CTGGTACATA CTGGTACAAC GATCCTTATA CATTAGATTT 360

CATGCGTAAA GATTTAAAAG AmAAATTAAT TAAAGCTGCA AAATAATTCA GCTATATAAG 420

TTAGTGAAAT GAGAGTCTGA AACATATCAA TCTTTTGATA TTGTATTAGG CTCTTATTTT 480

TATAGCTAGA AAGTTAGATA TTTGTATTTT TTTAAATAAT AAGTGCCGTT GTTATCGTTC 540

AATTTAATTA ATGATAGATT AGTATTATTA TAGCTAAAGT AGTATACCTG AGAAAATAGC 600

TCAATGTATC TCTTTATTAA TAAGTTATAT CATAATTATT TTAGTGCATA CTTTATGGAA 660

GGGATATCAG GGAATGGCTT TCAATTAAAG AAGAGGTTTA AAAGGATTAC AACAGAATGT 720

TATGATTTTG TAGAAAGATA TATAACAACG TTTTATAAAA ACATAATATT GTTAATGGAA 780

AATGAAATGT AAGGGGGATT TCGAGTGACT AAGAAAGTTT ATTTTAACCA CGATGGTGGT 840

GTAGATGATT TAGTATCTCT ATTTTTATTA TTACAAATGG AAAACGTTCA ATTGATAGGG 900

GTCAGTACAA TTGGTGCTGA TTGTTATTTA GAGCCATCTT TGAGCGCATC AGTAAAAATT 960

ATTAATCGTT T7TCAAATGA AGATATTCAA GTTGCGCCAT CATATGAACG AGGAAAAAAT 1020

CCATTTCCTA AAGAATGGCG TATGCATGCC TTTTTTATGG ACGCATTGCC AATTTTAAAT 1080

GAGCCAGTCA AACATGTTGC TTCAAATGTG AGCGACAAAG AAGCCTTTGA AGACATTATT 1140

CAAACTTTAA AGAGACAATC AGAAAAAGTA ACATTATTAT TTACAGGCCC GCTTACAGAT 1200

TTAGCAAAAG CACTACAAAA AGATTCATCT ATCGTTCAGT ATATAGAAAA ATTAGTTTGG 1260

ATGGGTGGCA ccittttacc AAAAGGAAAT GTTGAAGAAC CTGAGCATGA TGGTTCTGCA 1320

GAATGGAATG CA7ATTGGGA TCCAGAAGCG GTTAAAATTG TTTTTGATAG CGATATAGAG 1380

ATTGATATGG TTGCTTTAGA AAGTACGAAT CAAGTACCGC TAACGTTAGA TGTTAGACAA 1440
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5 



GTACCACCAT TAACACACTT TATAACAAAT TCTACTTACT TTTTATGGGA TGTTTTAACG 1560

ACTGCTTATA TTGGTAACAA GGACTTGGTT CATTCAATTG AGAAAAAAGT CGATGTAATA 1620

AGTTATGGAC CAAGTCAAGG TAAGACATTT GAGTGTAAAG ATGGGCGCAA AATTAATGTC 1680

ATAAATCATG TAGATAACAA CGCATTTTTT GATTATATAA CTGCACTTGC TAAAAAAGTA 1740

AATTAACAGC TGTGTAGAAT AATTAAGGTT TTAATTTATA TAGAACAACT TATTGTAAAC 1800

TTTTCATTTC TTAAAGTTTA CAATGGTGCT ATAATAATGG TCATGAAATA CGAAAGGAAG 1860

TAAAAAATGA CAACAAAACA GTTAGTATAT ACAGCTTTAA TGACAGCGAT TATCGCTATT 1920

TTAGGATTGG TACCGGTAAT TCCACTACCA TTTTCTTCAG TACCAATTGT ACTTCAAAAC 1980

ATTGGTATTT TCTTAGCAGG TGCGATTTTA GGACGTAAAT ATGGCACATT AAGTGTTATC 2040

GTCTTTTTAT TATTAGTAGT TGCTGGCTTG CCATTGTTAT CAGGTGGTCG CGGTGGCATC 2100

GGTGTATTCG CAGGTCCTTC AGCAGGGTTT TTACTATTAT ATCCAGTTGT AGCATTCATG 2160

ATTGGGGCGA TTCGAGATAG ATTCATCAAT GAAATTAATT TCTGGATTTT ATTCGTTGGT 2220

ATTTTAGTTT TTGGTGTTAT AGCATTAGAT GTTATTGGTA CATTGATTAT GGGCATGATT 2280

ATTAACATAC CATTTACGAA AGCTATTTCA ATTTCATTAG CTTATTTGCC TGGTGATATA 2340

TTAAAAGCAA TTGTAGCAAG TTTGATTGGT ACAGCTTTAC TTAATCACTC GCAGTTTCGT 2400

CAAATTATGG GAATAAAATA ATCATATTTA AGATAGTAAA GTAATTGAAT AAGTTGCTTT 2460

GAAATTTATA AAAGTGAAAG GAGTAGGTGT CAATGGCTAG TATAAGTATG TCAGATATAT 2520

ATTGTAACGG CACTATATTT GAAAATGACG ACGAGCAGTT GATTTATTTA ACGCCTTCTT 2580

TTCCACAACG ATACACAAGT AACACATGGA TATATAAAAA GACGCCTACC CAAGAGCGAT 2640

GGCTGAAAGA CTTAGAACGT CAACATCAAT TACATACAAA TCAAGGTTCA AATCATTATG 2700

CGTTTAGTTT CCCGGAAAAT GAACAACTTG ATAATCATTG GATGGCTATG TTTAAAGATA 2760

TGAATTTTGA ACTAGGTATT ATGGAATTGT ATGCCATAGA AAGTGATGCG CTTGCCAATT 2820

TGCCGCGTAA CTCTGACGTT GAAATTGCCA TCGTTGACGA GTCGCATATA GATGCCTATT 2880

TAAAAGTTGC ATATCAGTTT AGTTTGCCAT TTGGAAAAGA CTATGCAGAT GCACATGAAG 2940

AAATGGTAAG GGAACATTAT CAAAAAGATG TGATTAAACG CTTAGTAGCT TATTTAAATA 3000

ATGAACCTAT TGGCGTTGTA GATGTCATTG AAAGTGAAAA TTACATTGAA TTAGATGGAT 3060

TTGGTGTATT AGAACAATTT CGGCACCAAG GAATTGGATC TACAATTCAA TCGTTGATAG 3120

GTGAATACGC CATATCAAAA AATCACAAAC CAATCATATT AGTTGCAGAT GGTGAAGATA 3180

CAGCAAAAGA TATGTATGCA AAGCAAGGTT ATGTCTATCA ATCGTTTTGT TATCAAATAT 3240
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TAAGCTGGTT TCGAGTAGAA ATCAACTTAC TGCTTTTTAA ATTGTTTTCA GCTACTTATA 3 36 0 

CTTATAAAAA TAGTGCGTTT AAATTGTTGA TTCATGTAGA ATATCGTTCA TTATGACACA 342 0 

CTATAATGAA TATGTTATTG TTCAGAATCA ATGATACGTT CTGGATGACT GTATATATTA 34 8 0 

AAGCCACCAT TTCGAATAAA TCCAACTGCC GTAATATTTA GGTCATTAGC TAAGGTTACA 3 54 0 

GCAAGCGTTG TCGGAGCTGA TTTAGATAAA ATGACGCCAA CACCAATTTT TGCGGCTTTA 3 600 

ATTAAAATTT CTGATGAAAT ACGTCCACTA AAAATTAATA CTTTATCTCG GACAGTAATA 3 66 0 

TGTCGCTGAA TACAAAATCC ATATAATTTA TCTAGAGCGT TATGTCTACC AATGTCTTGT 3720

CGATGTACAA AAAATGTCAA ACCATCGCTT ATAGCAGCAT TATGTAAGCC ACCTGTTTCT 378 0 

TGGTAAATAT GACTTGCACT TTGTAATCGA GTCATCATGT TAATAATTTG CATTGGAGTT 3840

AAAGTGATTT TAGACATAGA TGTTTTAGCG ATAGCAGCAT CATTTTGAAA ATAAAACTCA 3 900 

CGACTCTTTC CGCAACAAGA TGCAATCATT CGTTTTGTGG AATATTGAAA GCGATCGCCT 3960

AAATCTTTAT TAAGTTCAAC ATGGGCAAAA CCTTTACTAT CATCAATCAG TACAGATTTT 4 02 0 

AATTCATCTC GCTTTAAAAT GGCACCTTCC GAAGCCAGAA ATCCAATGAC TAACTCCTCA 4 0 80 

AGGTTTGTTG GACTGCATAT AACAGTCGCA AATTCTTCAC CATTCACCAT AATTGTAAGT 4140

GGAAATTCTG TCACATATTG ATCTGTTGTA TTGAATAATT TTCCATCTTC ATATCTAACA 4200 

ATTGGTTGAC CTAAAGATAC ATCTTTGTTC ATTATCTAAC CCCTTTAATT AGCTTAAACT 4260

TTATTTTAAA GCAATTTGCT TAAAATTTTA ACATATTTGC TTAAGTTTGA AATTTGATTG 4 3 20 

ATAAAAATTA ATAGCGAGCA ATCTGTTTGA TTTAAATTGA ATTCGAGAAT ATACATACTA 4 3 80 

GGGCATCAAT TAATAAATAT CAATCTTATG CAAATTTGAC AATTGTTTGA ATCAATATAT 44 4 0 

AAACAGGCAA CGGTTCTTTT CAAATATAAT AGTAAGTGTA TAATGAAAAT GTAAATATTA 4 5 00 

TTAAAAATGG GGGTTCACTC AATGAAATTG AAACGTTTAT TTGCTGTTGT GATTGCAATG 4 560 

CTTTTAGTAT TAGCTGGTTG CTCTAATTCT AACGATAATA ATGAAAGTAA AAAAGATGAC 4620

GCAGACAATG GTAAGAAACA AGAGATTCAA GTTGCAGCGG CAGCAAGTTT AACAGATGTA 4 680 

ACCAAGAAAT TAGCTTCAGA ATTTAAAAAA GAGCATAAAA ATGCTGATAT TAAATTTAAC 4 74 0 

TATGGTGGAT CAGGGGCATT AAGAAAACAA ATTGAATCAG GCGCACCTGT TGACGTATTT 4800

ATGTCTGCAA ATACTAAAGA TGTAGATGCA TTAAAAGACA AGAATAAAGC GCATGATACA 4860

TATAAATATG CGAAAAATAG TCTAGTATTA ATTGGTGATA AAGATTCAAA TTACACTTCA 4 92 0 

GTAAAAGACT TAAAAGACAA TGATAAATTA GCATTAGGTG AAGTGAAAAC TGTACCAGCA 4 98 0 

GGAAAATATG CGAAACAGTA TTTAGATAAC AATAACTTAT TTAAAGAAGT CGAAAGTAAA 5 04 0 
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CAAGGTTTTG TGTA7AAAAC TGACTTATAT AAACAAAATA AAAAAATTGA TACTGTAAAA 
GTAATTAAAG AAGTAGAACT TAAGAAGCCA ATCACATACG AAGCTGGTGC TACATCAGAT

AGTAAATTAG CAAAAGAGTG GATGGAATTC TTAAAATCAG ATAAAGCTAA AGAAATACTA

AAAGAATACC ACTTTGCAGC ATAAGGAGTT GTAATCCATG CCTGACTTAA CACCTTTTTG 5 34 0 

GATATCAATA CGAGTTGCTG TAATCAGTAC GATTATTGTA ACGGTTTTAG GTATTTTTAT 54 00 

ATCTAAATGG TTGTATCGTC GTAAGGGTTC GTGGGTTAAA GTATTGGAAA GTTTATTGAT 5460

ATTACCTATT GTTTTGCCGC CAACGGTATT AGGTTTTATT CTATTAATCA TCTTCTCGCC 5520 

AAGAGGACCA ATCGGTCAAT TCTTTGCGAA TGTACTACAT TTACCTGTAG TGTTCACTTT 5 5 80 

GACAGGTGCT GTGATAGCAT CTGTCATTGT TAGTTTTCCA CTAATGTATC AACATACTGT 564 0 

GCAAGGCTTC AGAGGTATAG ACACGAAAAT GATTAATACA GCTAGAACGA TGGGAGCAAG 57 00 

TGAAACGAAA ATTTTCCTCA AATTAATTTT ACCATTAGCT AAACGCTCTA TTTTAGCAGG 5760

TATAATGATG AGTTTTGCTC GTGCATTAGG TGAGTTTGGT GCTACATTAA TGGTTGCAGG 5820

ATATATTCCA AATAAAACGA ATACACTACC TTTAGAAATA TACTTCTTAG TGGAACAAGG 5880 

TAGAGAAAAT GAAGCGTGGT TATGGGTATT AGTGCTAGTC GCATTCTCTA TTGTGGTTAT 5940

ATCTACAATT AATTTATTGA ATAAAGATAA ATATAAGGAG GTCGACTAGA TGCTTAAAAT 6 000 

CAATGTGAAA TATCAATTAA AGAACACTTT AATTCGCATC AATATAGATG ATACTGAACC 6 060 

AAAAATTTAT GCAGTTCGTG GTCCATCTGG CATTGGTAAA ACTACTGTTT TAAATATGAT 6120

TGCCGGATTA CGTAAAGCAG ATGAAGCTAT TATCGAAGTG AATGGGCAAT TACTTACTGA 6180 

TACGGCAAAA AACGTGAATG TTAAAATTCA ACAACGACGT ATTGGATATC TGTTTCAAGA 6240 

CTACCAATTG TTTCCTAATA TGACGGTCTA TAAAAATATT ACTTTTATGG CTGAACCATC 63 00 

TGAACACATC GATCAATTAA TTCAAACTTT AAACATTGAT CATTTGATGA AACAATATCC 6 3 60 

TATGACATTG TCAGGTGGAG AGGCACAACG TGTAGCACTT GCACGTGCAC TTAGCACrAA 6420 

ACCAGATTTA ATTTTATTAG ATGAACCTTT TTCTAGTTTG GATGATACTA CAAAAGATGA 64 80 

GAGTATTACA TTAGTTAAAC GTATTTTCAA CGAATGGCAA ATACCAATCA TATTTGTGAC 654 0 

ACATTCAAAC TATGAAGCAG AACAAATGGC TCATGAAATT ATTACAATTG GGTAATCATT 6600

TATTTGCCAT TAAAGAGTTT AGAACGTATT TAAAATTGTA GAAGTGAATG CTTCTATCAG 666 0 

CATTTTAATG ATGTTTTAAA CTCTTTTTTA GGGGCAGTTT TTTTGAGAGA CATTGACGCG 672 0 

SO CGTCATATAA TGAAAGTAAT GATAAAAAGA AAGGATAACT TAATGTGAGT CAAGAACGTT 6 78 0 

ATTCAAGGCA AATTTTATTT AAACAAATAG GTGAAATAGG TCAAAGCAAA AT AAA TC AAA 6 84 0 
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GAGCAGGCAT TGCCAAACTA ATCATTGTTG ATAGAGATTA TATTGAATTT AGTAATTTAC 6 960 

AAAGACAAAC ATTGTTTACT GAAGAAGATG CTTTGAAAAT GATGCCTAAG GTGGTTGCAG 7020

CTAAAAAGCA TTTGCTAGCG TTACGTAGTG ATGTTGATAT TGATGATTAT ATTGCCCATG 7080

TGGATTATTA TTTTTTGGAA ACACATGGAC AGGACGTTGA CGTTATTATT GATGCAACCG 7140

ATAACTTTGA AACACGACAA CTGATTAATG ATTTTGCATA TAAATATCGT ATACCTTGGA 7200

TTTATGGTGG TGTTGTACAG AGTACATATA CAGAAGCTGC ATTTATACCT GGTAAAACAC 7260

CTTGCTTTAA CTGTTTGGTA CCACAATTGC CAGCATTAAA TTTAACATGT GATACAGTAG 7320

GGGTCATTCA ACCTGCCGTG ACGATGGCAA CAAGTTTACA ATTAAGAGAT GCGATGAAAG 7380

TATTAACGGA ACAACCAATT GACACAAAAA TAACTTATGG CGATATTTGG GAAGGTAGTC 7440

ATTATTCATT TGGTTTCAGT AAAATGCAAC GTTCAGACTG TACAACTTGT GGAGATGTAC 7500

CAAGTTATCC GTATTTAAAC AAGAATGAAC AACGTTATGC AACATTGTGT GGTAGAGACA 7560

CTGTACAGTA TGAAAATGCA TCAATTACAC ACGACATTCT TGTTCAATTT TTAAAACAAC 7620

ATCAGTTAAA TTATCGCAGT AATTCGTATA TGGTTATGTT TGAATTTAAA GGACACCGCA 7680

TTGTTGCTTT TAAAGGTGGA AGGTTTTTAA TACATGGCAT GACACGCACA TCAGATGCCA 7740

CACATCTAAT GAATTTATTG TTTGGATAAA AAAAGATAAG ACAAAAGGAG TGTAATATTA 7800

TGGGCGAACA TCAAAACGTT AAATTGAATC GTACAGTTAA AGCAGCCGTA CTAACGGTAT 7860

CAGATACTAG AGACTTTGAT ACAGATAAAG GTGGTCAATG CGTGCGCCAA CTATTACAAG 7920

CAGATGACGT TGAAGTGAGT GACGCACATT ATACAATTGT GAAAGATGAA AAAGTAGCCA 7980

TCACGACGCA GGTGAAGAAG TGGTTAGAAG AAGATATTGA TGTCATCATT ACGACTGGTG 8040

GAACAGGTAT TGCACAACGT GATGTGACGA TTGAAGCAGT AAAACCACTT TTAACTAAAG 8100

AGAtAGAAGG CTTTGGGGAA TTGTTTAGAT ATTTGAGTTA TGTTGAAGAT GTTGGCACGC 8160

GTGCATTATT GTCTCGTGCT GTAGCAGGTA CAGTTAATAA TAAATTGATA TTTTCGATTC 8220

CAGGATCAAC AGGCGCAGTT AAATTAGCAT TAGAAAAGCT CATTAAACCA GAATTAAATC 8280

ATCTGATTCA TGAGCTTACA AAATAATTTA TTGATTTGAT TGGCGTTGAA AATCTCCAGA 8340

TTTACCGCCA GACTTGCTTT CAAGGTAGGT TTCGCCAATA ATCATACCTT TATCAACTGC 8400

TTTCGTCATG TCGTAAATGG TTAAAGCCGT TGCTGATGCA GCGGTTAAAG CTTCCATTTC 8460

AACACCGGTT TTGCCAGTTG TAGAGACAGT TGTTTGAATG TTTAAAGTAT AAAGGGGTGC 8520

ATTTGTTTCA TCCCAGCTGA AGTGAACATC TATGCCAGTC AATGGTAATG GATGGCACAT 8580

CGGAATAAGT GTTGATGTAT TTTTGGCAGC CATAATACCA GCGATTTGAG CAGTGTTCAA 8640
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45 



AATGCTTGAA TGAGCGACAG CAGTTCTTTT TGTAATTTGT TTGTCTGATA CATCGACCAT 8760

TTTGGCGTGG CCTTGTTGAT TAATATGAGT AAACTCAGTC ATTTTACCCC TCCTAGTGCA 8820

TCTAGTATAT CATGAAAAAA TAAAAGTTTT GGAGATGATT TTTAATGGTA GTAGAAAAAA 8880

GAAACCCAA? CCCAGTTAAA GAAGCAATTC AACGTATCGT TAATCAGCAG AGTTCAATGC 8940

CGGCAATTAC GGTAGCACTT GAAAAAAGTC TAAATCATAT CTTAGCAGAA GATATTGTAG 9000

CTACTTATGA TATACCAAGG TTTGATAAAT CACCTTATGA TGGTTTTGCA ATTCGCAGTG 9060

TTGATTCACA AGGGGCAAGT GGTCAGAATC GCATTGAGTT TAAAGTGATT GATCATATTG 9120

GTGCAGGTTC AGTTTCTGAT AAATTAGTTG GCGATCACGA AGCGGTGCGT ATTATGACTG 9180

GAGCACAAAT ACCTAATGGC GCAGATGCTG TTGTTATGTT TGAACAAACG ATTGAACTAG 9240

AAGATACATT TACAATTCGT AAACCATTTT CAAAAAATGA AAATATATCT TTAAAAGGTG 9300

AAGAAACAAA GACAGGCGAT GTTGTTCTAA AAAAAGGACA AGTAATTAAT CCAGGGGCTA 9360

TCGCGGTCCT TGCAACATAT GGCTATGCAG AGGTTAAAGT TATTAAGCAA CCGAGTGTCG 9420

CTGTTATTGC AACAGGAAGC GAATTATTAG ATGTTAATGA TGTATTAGAA GATGGGAAAA 9480

TTCGTAACTC TAATGGCCCA ATGATTCGTG CCTTAGCAGA AAAATTAGGT CTTGAAGTTG 9540

GTATTTACAA AACACAAAAA GATGATTTAG ATAGTGGCAT CCAAGTCGTT AAAGAAGCTA 9600

TGGAAAAACA TGATATCGTT ATTACAACGG GCGGAGTTTC TGTTGGAGAT TTTGACTATT 9660

TACCTGAGAT TTATAAGGCT GTAAAGGCGG AAGTGTTATT TAATAAAGTA GCAATGCGTC 9720

CTGGTAGCGT AACAACGGTT GCATTTGTAG ATGGaAAGTA TTTGTTTGGa TTATCTGGAA 9780

ATCCATCAGC TTGTTTTACA GGATTTGAAC TATTTGTGAA nCCAGCTGTT AAACATATGT 9840

GTGGCGCACT AGAAGTCTTC CCGCAAATAA TTAAAGCAAC ATTAATGGAA GATTTTACCA 9900

AGGGAAACCC ATTCACACGA TTTATACGTG CTAAAGCAAC GTTAACAAGT GCTGGAGCTA 9960

CTGTAGTACC TTCAGGATTC AATAAATCAG GTGCGGTTGT AGCGATTGCA CATGCTAACT 10020

GTATGGTCAT GTTACCAGGA GGGTCACGTG GTTTTAAAGC GGGGCATACA GTAGATATTA 10080

TATTGACTGA ATCTGACGCT GCTGAAGAGG AACTTCTTTT ATGATTTTAC AAATTGTAGG 10140

TTACAAAAAG TCTGGTAAGA CAACATTGAT GAGGCATATT GTCTCTTTCT TAAAGTCACA 10200

TGGTTATACA GTTGCTACTA TTAAACATCA TGGGCATGGT AAGGAAGATA TTCAATTACA 10260

GGATTCAGAC GTCGATCACA TGAAGCATTT TGAAGCGGGG GCAGATCAAA GTATTGTACA 10320

AGGTTTTCAA TATCAGCAAA CTGTAACACG TGTAGATAAT CAAAATCTTA CTCAAATTAT 10380

TGAAAAATCT GTTACAATTG ACACCAATAT CGTATTAGTT GAAGGCTTTA AAAATGCTGA 10440
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GAATGTTTGT TATAGCATTA ATGTAAGGGA GCATGAAGAT TTTACAG CAT TTGAGCAATG 10560 

GTTATTAAAT AAAATTAAAA ATGATTGTGA TACACAATTA ACATAGAGGA TTGAAATGAA 10620

TGAAACAATT TGAAATCGTG ACAGAACCGA TACAAACAGA ACAATATCGT GAATTCACTA 10680

TAAATGAATA TCAAGGTGCA GTAGTTGTTT TTACCGG7CA TGTTCGCGAA TGGACTAAAG 10740

GCGTCAAAAC GGAATATTTA GAATATGAAG CGTATATTCC AATGGCTGAA AAGAAATTGG 10800

CACAAATTGG AGATGAAATA AATGAAAAAT GGCCTGGAAC GATAACGAGT ATTGTTCATA 10860

GAATAGGGCC ATTACAAATT TCAGATATCG CTGTATTAAT TGCGGTTTCT TCACCGCATC 10920

GTAAAGATGC CTATCGAGCA AATGAATATG CAATTGAGCG TATAAAAGAA ATTGTTCCGA 10980

TTTGGAAAAA AGAAATTTGG GAAGATGGTT CAAAATGGCA AGGGCATCAA AAAGGGAATT 11040

ATGAAGAAGC AAAGAGGGAG GAATAAGAGA GATGAAGGTA CTTTACTTCG CAGAAATTAA 11100

AGATATATTA CAAAAAGCAC AGGAAGATAT TGTGCTTGAA CAAGCATTGA CTGTACAACA 11160

ATTTGAAGAT TTATTGTTTG AACGTTATCC GCAAATCAAT AATAAAAAGT TTCAAGTTGC 11220

TGTAAATGAG GAATTTGTAC AAAAATCGGA TTTCATTCAA CCTAATGATA CTGTTGCATT 11280

AATTCCACCG GTTAGTGGAG GTTAAGGGAG CATGAAAGCA ATAATTCTTG CAGGTGGTCA 11340

TTCAGTGCGA TTTGGTAAGC CCAAAGCTTT TGCGGAAGTG AACGGTGAGA CCTTTTATAG 11400

TAGAGTAATT AAGACATTAG AATCAACAAA TATGTTCAAT GAAATTATTA TTAGTACAAA 11460

TGCGCAATTG GCAACGCAAT TTAAATATCC AAATGTTGTT ATAGATGATG AGAATCATAA 11520

TGATAAAGGT CCATTAGCAG GAATTTATAC AATCATGAAG CAACATCCTG AAGAAGAATT 11580

GTTTTTTGTC GTTTCTGTTG ATACACCAAT GATTACTGGT AAAGCTGTAA GCACGTTGTA 11640

TCAGTTTTTA GTTTCTCATC TTATTGAAAA TCATTTAGAT GTCGCAGCTT TTAAAGAAGA 11700

TGGATGTTTT ATTCCAACAA TTGCATTTTA TAGTCCGAAT GCATTAGGCG CTATAACTAA 11760

AGCACTACAT TCTGATAATT ACAGTTTTAA AAATGTATAT CATGAATTAT CAACGGATTA 11820

TTTGGATGTA AGGGATGTAG ATGCGCCCTC ATATTGGTAC AAAAATATAA ATTATCAGCA 11880

TGATTTGGAC GCTTTAATTC AAAAATTGTA AGCTGTTAGG AGGTCCACAA ATGGTAGAAC 11940

AAATAAAAGA TAAACTAGGA CGTCCCATCC GTGACTTACG GTTATCTGTG ACAGATCGGT 12000

GTAACTTTAG GTGTGATTAT TGCATGCCTA AAGAGGTATT TGGAGATGAT TTCGTATTTT 12060

TACCTAAAAA TGAACTTTTA ACGTTTGATG AAATGGCTAG AATCGCTAAG GTATATGCAG 12120

AATTAGGTGT AAAAAAAATA CGCATTACAG GTGGAGAACC ATTGATGCGA CGGGATTTAG 12180

ATGTACTTAT AGCTAAATTA AATCAAATCG ATGGTATTGA AGATATTGGT TTGACTACAA 12240
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20 



ATGTCAGTTT GGATGCTATT GATGATACGC TATTTCAATC AATCAATAAT CGTAATATTA 12360

AAGCGACTAC GATTTTAGAA CAAATTGATT ACGCGACGTC TATTGGTTTG AATGTAAAAG 12420

TAAATGTTGT TATACAAAAA GGTATTAACG ATGATCAAAT CATACCAATG CTTGAATATT 12480

TTAAAGATAA ACATATAGAG ATTCGATTTA TAGAATTTAT GGATGTTGGT AATGATAATG 12540

GATGGGATTT CAGTAAAGTT GTAACTAAAG ATGAAATGCT TACAATGATA GAGCAGCACT 12600

TTGAAATCGA TCCTGTAGAA CCAAAATATT TTGGGGAAGT AGCAAAATAT TATCGCCATA 12660

AGGATAATGG TGTTCAATTT GGTTTGATTA CAAGTGTTTC ACAATCATTT TGTTCTACAT 12720

GTACACGCGC AAGGCTGTCA TCAGATGGGA AGTTTTACGG ATGTTTATTT GCAACTGTCG 12780

ATGGATTTAA CGTTAAAGCG TTTATTCGTT CTGGCGTGAC CGACGAAGAA TTAAAAGAAC 12840

AATTTAAAGC TTTATGGCAA ATAAGAGATG ATCGATATTC AGATGAGAGA ACTGCTCAAA 12900

CAGTTGCCAA TCGTCAACGT AAAAAGATAA ACATGAATTA TATTGGTGGT TAATGTGTAG 12960

GGACCACTAC ATATTAAATC ATTAGAGATG TTTTAATATT TCTGTCTTAC TCCCTAAAAT 13020

ACAATATTAT TTATTAAAGT AAAAACGGTC ATATCTATGC CAGATTTAAT AGAAATGATC 13080

GTTTTTAAAG TTTTTACAAG TTGGCGGGGC CCCAACACAG AAGCTGACAG AAAGTCAGCT 13140

TACAATAATG TGCAAGTTGG CGGGGCCCCA ACATAGAGAA TTTCAAAAAG AAATTCTACA 13200

GACAATGCAA GTTGGGGAAC GGGGCCCCAA CACAGAAGGT GACGAAAAGT CAGCATACAA 13260

TAATGTGCAA GTTGGCGGGG CCCCAACATA GAGAATTTCA AAAGAAATTC TACAGACAAT 13320

GCAAGTTGGG GATCAACGAA ATAAATTTTA TGAGAATATC ATTTCTATCC CACTCTTAAG 13380

AATCACTACA TAATAAATCT TTAGTGGTTC TTTAACATTG ATGTCACACT CCATGCCATT 13440

GAGTTGTAAT ATATCTTTTT TAGGTATAAA TGTTGTCGAA TAAACAACAA GTTGTCCAAA 13500

AGATATAAAT CTAAACAAGA TATAGCCAGC AATTTAATAT TTGTAATAGA TAAAATGCTA 13560

AGTTTGATAT ATAATAAATT TAAGTAATTG TATAATAATA TGAATTACAA ACATCTAAGA 13620

AGAAACATAG GAGGCATCAT ATTATGAGTA ATAAAGTTCA ACGTTTTATA GAAGCAGAAA 13680

GGGAGTTAAG TCAGTTAAAG CACTGGTTAA AAACAACACA TAAGATTTCA ATTGAAGAAT 13740

TTGTAGTCCT TTTTAAAGTG TATGAAGCTG AAAAGATTAG CGGTAAAGAA TTGAGGGATm 13800

CATTACATTT TGAAATGCTA TGGGATACAA GTAAAATCGA TGTGATTATC CGTAAAaTCT 13860

ATAAAAAAGA GCTTATTTCT AAATTGCGTT CTGAAACGGA TGAAAGACAA GTATTCTATT 13920

TCTATAGTAC TTCTCAAAAG AAATTGTTAG ATAAAATTAC TAAAGAAATA GAAGTGTTAA 13980

GCGTTACAAA CTAAAAACTT aAAAAgcaTG CCAATCTCTA TTCATCATAA TTGCGTCTTG 14040
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GTTCATGGCA TTTCTAGTTA CATGACGTCC ATGAATTAAG AAGTAAACAA GCATAGTAAT 
GATTGCTAAA GCGGCCATAA AGCCGAAGAT TTCACTATAT GAAAACATAT GAGTAAATAA 
CCCAAGGAAT GATGGACCGA AGCCGACACC TGCATCTAGA CCAACGTAAA AAGTAGATGT 
CGCGATACCA TATTTAATCG GGGGTGAGAC TTTTATCGCA ATAGATTGCA TTGCAGATGA 
TAAATTTCCA TACCCTAAAC CTAGGCAAGC ACCAGCAAGT AATA7TAACC AGCTTTGATA 
GCTTGAAATT AAGCATACAA ATGAAAGGAA AAGCATGATA AATGCTGGGT AGACAATAAT
A1TTTCATTT TTATCATCCA TCAATCTACC AGCAATAGGT CTAGTAATTA ACGATGCTAT
AGCATAGCAA ATAAAGAAAT AGCTTGCTGC AGTGACTAGG TGTCGCTCTA AAGCAAATGC
TTGTAAATAA GTTAGGATGG ACGCATAGGT AACGCCAATT AAAAGCATAA TTACAGCAAC
AGGAATGGCC TCTTTTGCAA TAAATTGATG AATACTAAAT CTTGGTTTAT CAATGACATT
AGTTTCAGTT TTGTTATTTG TTACTTCGAA ATCAACTTTT ATAAATAATG AGATAATGAG
TCCGAGTATG CCTAATATGA CACAAATAAT AAACAGTAAG TCAATTGCGT ATTTTGTAAT
AAGTAACATG CCTAGAAATG GGCCAATCGC TGTACCTAAT ACTAAACTTA AGGAAAATAA
ACTGATGCCT TCACTTTTTC TATTAACAGG GGTAACGTAT GCCGCAATAG TACCTGTTGC
AGTTGTCACA ACTGCAGTTG CGATACCGTT TATGAGACGT ACAAAGATTA AAAAAGCTAA
AGATCCATCA ATAAAATAAA GTAATTGCGT GATAATTAAA GCAATTAAAC CAATAAATAA
TAATCGTTTA GG7CCrATTT SATTTACAAA TTTACCTGTA GCAAATCGA

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9072 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GAGAGTCAAT GGCAAGAAGA ATATAAATAT TTGAGAGCGT TAATCTTTAA TGAAACAGAA 60 

TTAGAGGAAG CGTATAAATG GATGCATCCT TGTTACACGT TGAATAATAA AAATGTAGTA 120

CTTATCCATG GCTTCAAAAA TTATGTTGCA CTATTATTTC ATAAAGGTGC CATTTTGGAG 180

GATAAATATC ATACACTCAT TCAACAGACT GAAAAGGTGC AAGCAGCTCG TCAGTTACGA 240

TTTGAAAATT TAACAGAGAT TCAAGCACGT ACCGAAGAAA TTAAATATTA TCTAGCCGAA 300

GCAATTAAAG CTGAAAAAGC TGGTAAAAAA GTTGAAATGA AGAAAACAGA GGAATATGTT 360
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AAATTAACGC CAGGCAGACA ACATCAATAT ATATATCATA TTGGACAAGC TAAACGCAgT 480
GgAACAAGAC AAAAGCGTGT TGAAAAGTAT ATTAACCAAA TACTAGAAGG TAAAGGGATG 540
CATGATAAGT AATTAATGAG TAAAGCATAC CGGTTATACA ACAACATACA AGATGACACG 600
AAACAACCAA TGGCTCATGC TGTTGGTTGT TTTTTTAGGT GTGTCTGTCA TGGGCAACAC 660
TTTGACGTTG GAATTCCGTT ACAGGCTTGG GAGTAGAAAA TGTTAGCAAA AGGCAAGGGT 720
GTCTACAATG AATGATGAAG ATATTAAAAT ATAAGGATGA CTTTGTGAGT GGCGGATGGG 780
CGGTTGTCCG TCTGTAACAA TGGATGCGTG TGCATTATTA CAAAAATTCG ACTTTTGTAA 840
TAATATTTCA CATTTTCGAC ACTTTTTTGC TATAAAACAA CCAATTGAGC GATAATAAAT 900
TCGCTTTTAA AAAATATGAG TTATCTATTT AGTTGCCAAA GATAAAATAA TAATGTTTAA 960
TAACATCATA TAGAGTATGT TAGTTTTAAA TGTCGAATAT ACGAATGTGc AAACAAAGTA 1020
ATCGGTAGAA ATTCAACATA CATAGCGCCG TTTACTGTTA AGTATTCACA TTACAGATGA 1080
AAAATATAAA ATTCTACATA ATCAAGACCA TGATGTGTAC TTGTTTAACT TATGACTCTA 1140
TTTGTTTAAC AATTGCGATA ATGGTCTTTT TATTTTATGC GTATCATTCG TCATATTTTT 1200
TATGAGGAAG GAGAAATGAT TATGTTAAGT ATTAAGCATT TAACGAAAAT TTA1TCTGGT 1260
AATAAAAAGG CAGTAGATGA CATCTCTTTA GATATTCAAT CTGGGGAATT TATCGCATTT 1320
ATTGGAACCA GTGGAAGTGG CAAAACGACT GCTTTAAGAA TGATAAACCG TATGATTGAA 1380
GCGACAGAAG GACAAATTGA AATTGATGGT AAAGATGTTC GGAGTATGAA TCCTGTCGAA 1440
TTGCGTAGAA ATATTGGCTA TGTTATTCAA CAAATTGGCT TAATGCCTCA TATGACGATT 1500
AAAGAGAATA TTGTGTTGGT ACCCAAATTG TTGAAATGGA CTAAAGAGGA AAAGGATAAA 1560
CGTGCAAAGG AATTAATTAA ACTTGTGGAT TTACCGGAGT CATTTTTAGA GCGTTATCCA 1620
GCAGAACTAT CAGGTGGGCA ACAACAACGT ATCGGTGTTG TAAGAGCACT TGCGGCCGAA 1680
CAAGATATTA TTTTAATGGA TGAACCTTTT GGTGCATTGG ATCCTATTAC GAGAGATACG 1740
TTACAAGATT TAGTTAAAAC GTTACAACGA AAATTAGGCA AGACGTTTAT CTTTGTAACA 1800
CATGATATGG ATGAAGCGAT TAAATTAGCA GACAAAATTT GTATTATGTC AGAAGGTAAG 1860
GTGGTGCAAT TTGATACGCC AGACAATATT TTAAGACATC CCGCAAATGA TTTTGTACGT 1920
GATTTTATAG GACAAAATAG ACTGATTCAA GACCGTCCCA ATGACAAGAC TGTAGAAGGT 1980
GTAATGATTA AACCAATCAC GATACAAGCA GAAGCAACAC TGAATGACGC CGTTCATATT 2040
ATGAGACAAA AACGTGTTGA TACTATTTTT GTAGTAGATA GTAATAACCA TTTACTAGGT 2100
TTCTTAGACA TTGAAGATAT AAATCAGGGT ATACGTGGAC ACAAAAGTTT ACGAGACACC 2160
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ATTTTAAAAA GAAACGTTAG GAATGTACCT GTCGTAGATG ATCAACAGCG TTTAGTAGGA 22 8 0 

CTGATTACGC GTGCCAATGT TGTTGATATT GTATATGACA CGATTTGGGG CGATAGTGAG 2340

GATACAGTGC AAACAGAACA TGTGGGGGAA GACA~TGCGT CCTCAAAAGT GCATGAGCAA 2400

CACACTACTA ATGTCAAAGT ACGTGACATA GGAGATGATA AATCATGATT GAGTTCCTAC 2460

ATGAACATGG TGGACAGTTG ATGTCGAAAA CACTGGAACA TTTCTATATT TCTATAGTGG 2520

CATTATTACT TGCCATCATT GTTGCAGTAC CTATAGGCAT TTTATTATCA AAAACAAAGC 2580

GAACTGCCAA TATTGTATTA ACTGTGGCAG GTGTCTTACA AACTATTCCA ACACTAGCTG 2640

TACTTGCTAT TATGATACCG ATTTTTGGTG TTGGTAAAAC GCCTGCAATT GTAGCGCTAT 2700

TTATTTATGT ATTATTACCT ATTTTAAATA ACACGGTACT CGGTGTTCAA AATATTGATA 2760

GCAACATTAA AGAAGCTGGA AAAAGTATGG GAATGACACA ATTTCAATTG ATGAAGGATG 2820

TTGAATTGCC GTTAGCATTG CCGCTTATCA TTGGTGGCAT TCGTTTGTCA TCTGTGTATG 2880

TAATTAGTTG GGCTACACTT GCAAGTTATG TAGGTGCGGG TGGATTAGGT GATTTCATTT 2940

TCAATGGTTT AAATTTATAT GATCCACTGA TGATTGTAAC TGCAACGGTA CTCGTTACTG 3000

CACTAGCATT AGGTGTTGAT GCCTTATTAG CTTTAGTTGA AAAATGGGTA GTTCCCAAAG 3060

GCTTAAAAGT ATCTGGATAA TTAGGAGGCT AAGATAATGA AGAAAATTAA ATATATACTT 3120

GTCGTGTTTG TCTTATCGCT TACCGTATTA TCTGGATGTA GTTTGCCCGG ACTAGGTAGT 3180

AAGAGCACGA AAAATGATGT CAAAATTACA GCATTATCAA CAAGCGAATC GCAAATTATT 3240

TCACATATGT TACGGTTGTT AATAGAGCAT GATACACACG GTAAGATAAA GCCAACATTA 3300

GTAAATAATT TAGGGTCAAG TACGATTCAA CATAATGCCT TAATTAATGG GGATGCTAAT 3360

ATATCAGGTG TTAGATATAA TGGCACAGAT TTAACGGGAG CTTTGAAGGA AGCACCAATT 3420

AAAAATCCTA AGAAAGCAAT GATAGCAACA CAACAAGGAT TTAAAAAGAA ATTTGATCAA 3480

ACGTTTTTTG ATTCGTATGG TTTTGCGAAT ACGTATGCAT TCATGGTAAC GAAGGAAACC 3540

GCTAAAAAAT ATCATTTAGA GACAGTTTCA GATTTAGCAA AGCATAGTAA AGATTTACGT 3600

TTAGGTATGG ATAGTTCATG GATGAATCGT AAAGGCGATG GCTATGAAGG ATTTAAAAAA 3660

GAGTATGGTT TTGACTTTGG TACAGTGAGA CCAATGCAAA TAGGTCTAGT CTACGACGCA 3720

TTAAACTCAG AGAAGTTAGA CGTTGCATTA GGTTATTCTA CAGATGGTCG AATTGCGGCG 3780

TATGATTTGA AAGTACTTAA AGATGATAAA CAATTTTTCC CACCTTATGC TGCGAGTGCT 3840

GTTGCAACAA ATGAATTATT ACGGCAACAC CCAGAACTTA AAACGACGAT TAATAAGTTG 3900

ACAGGAAAGA TTTCGACTTC AGAGATGCAA CGCTTGAATT ATGAAGCGGA TGGTAAAGGT 3960
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AAAGGTGGTC ATAAGTAATG GAAGGTAATT TATTACAGCA ATTATTCAAT TATTATGTTA 4 0 80 

CGAACTTTGG TTATCTATGG GATTTATTTT TCAAACACTT ATTAATGTCT GTCTATGGTG 4140

TGCTGTTTGC AgCTTTAATT GGTATTCCAT TGGGAATCTT GCTTGCaAGA TACACAAAAC 4200

TTTCTGGATT TGTAATTACA ATTGCAAATA TAATTCAAAC AGTTCCAGTC ATTGCAATGT 4260

TAGCTATTTT AATGTTAGTC ATGGGCTTAG GTTCAGAAAC AGTAGTT7TA ACAGTGTTTT 4320

TATATGCGTT ACTTCCAATT ATAAAAAACA CTTATACTGG TATAGCTAGT GTTGATGCGA 4380

ATATTAAGGA TGCTGGCAAA GGTATGGGAA TGACACGCAA TCAAGTGCTA CGAATGATTG 4440

AATTACCGTT ATCTGTTTCG GTTATTATCG GTGGCATTCG TATTGCCTTG GTTGTTGCGA 4500

TAGGTGTTGT TGCCGTTGGA TCATTTATAG GAGCACCTAC GCTTGGTGAC ATTGTGATTC 4560

GTGGTACAAA TGCGACGGAT GGCACAACGT TTATTTTAGC AGGTGCGATT CCGATTGCTA 4620

TCATTGCAAT CGTCATTGAT GTACTATTAA GATTTTTAGA AAAACGATTA GACCCAACAA 4680

CACGACATCG TAAAAATCAA TCTAATCATC GGCCGCAAAG TATTAATATG TAATAGTAGA 4740

AGATGTTTAT AATTTAGCGA TTTCGTTTCA TGATTTATAA AAAATGAGGC TACTCAAGGA 4800

GCTCAAATAA TCTTTGAGTA GCCTTTTTAT AGGTTGTGTT TGTATGCGTT TACACTAAAA 4860

TAGCAATTAT TATCATGAAA GTTTTTGGAT AAAAAGCGTT AATTATTGTA AAAATACTAA 4920

AAAATGAGAT GTTTTATTTA TAATTTTCTG CAAATTTATG ATATTGTTTC TTAATATATC 4980

ATATTAAAAA TTTGTTTTTC TTAAACATAG GAGGCTTATC TAATTCATGG ACACATCAAA 5040

ACAATTTAGA GGTGACAACC GATTGCTTTT GGGTATCGTT TTAGGGGTTA TTACCTTTTG 5100

GCTATTCGCG CAGTCACTTG TTAATCTTGT TGTCCCATTA CAATCAACAT ATAGTAGTGA 5160

CGTTGGAACG ATAAATATCG CTGTTAGCTT ATCTGCCTTA TTTGCTGGTT TGTTTATCGT 5220

AGGTGCTGGT GATGTTGCTG ATAAATTTGG TCGCGTCAAA ATTACTTATG TAGGATTGAT 5280

ATTAAATGTT GTAGGTTCAT TACTCATCAT CATTACACCT TTGCCAGCAT TTTTAATTAT 5340

AGGTAGAATA ATTCAAGGTT TGTCTGCAGC ATGTATTATG CCATCAACAC TTGCTATTAT 5400

TAACGAATAT TATATTGGTA CAAGAAGACA ACGTGCCTTA AGCTATTGGT CTATTGGTTC 5460

TTGGGGTGGT AGTGGTATTT GTACGTTGTT TGGTGGCTTA ATGGCTACAT ATATAGGTTG 5520

GCGTTCAATA TTTGTTGTTT CAATTCTATT AACATTATTA GCAATGTACT TAATCAAACA 5580

TGCACCTGAG ACTAAA32AG AACCAATCAA AGGTATGAAA GCAGAAGCTA AAAAGTTTGA 5640

CGTTATTGGT TTAGTCATTT TAGTAGTGAC GATGTTAAGT TTAAATGTAA TCATCACACA 5700

GACGTCTCAT TTTGGTTTAG TTTCACCGTT AATTCTAGGT TTAATTGTTG TGTTTATCTG 5760
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AATTTTTAAA AATAGAGGAT ACAGTGGTGC AACTATTTCA AACTTCTTAT TAAATGGTGT 58 8 0 

AGCAGGTGGT GCACTTATCG TTATTAACAC GTATTATCAA CAACAATTAG GATTTAATTC 5940

TTCGCAAACG GGTTATATTT CATTAACGTA TTTAATAACA GTGTTGTCAA TGATTCGTGT 6000

AGGTGAAAAG ATTTTATCTC AACATGGTCC GAAGCGCCCA CTATTACTAG GAAGTGGCTT 6060

TACAGTGATT GGGTTAATCT TATTGTCGTT AACATTTTTA CCAGAAGTGT GGTATATCAT 6120

ATCTAGTATA GTTGGATATT TATTGTTTGG TACTGGTTTA GGATTATATG CTACACCATC 6180

AACTGATACA GCAGTTGCTA GTGCGCCAGA TGATAAGTCG GGTGTTGCTT CAGGTGTGTA 6240

TAAAATGGCG TCATCATTAG GAAATGCATT TGGAGTAGCA GTATCTGGTA CGGTTTATAC 6300

TGTGTTAGCA GCTAATTTAA ATTTGAACTT AGGTGGTTTC ACAGGTATGA TGTTTAATGC 6360

CTTGCTAGCA ATTGTTGCAT TTTTAGTCAT TTTACTATTA GTTCCTAAAA ATCAAACGAA 6420

TTTGTAAAAC TGAAATGAAA GCAAGTTATT ATGTAGGGAT TTTAAAGGAA ATTTTGTGAA 6480

AGTAAGTTTA TCATACACAC TTAATGTTGC GTATTGACGT TTAATGTTAG GTGTGTTCTT 6540

TTATAGACGA TAAAAGCTGT GTGCATATTA AGCGAATGAT TTTCAAATTG ACGCTAATAT 6600

GCGAAAGTAG TATTTTTAAA ATGAACAACA ACGATGAAGA GGGGTTTATA GGATGAAAAT 6660

TGCAATTGCT GGATCGGGTG CATTAGGTAG TGGCTTTGGT GCCAAACTAT TTCAAGCAGG 6720

ATATGATGTC ACACTTATTG ACGGATATAC ATCTCATGTT GAAGCGGTTA AGCAACATGG 6780

ATTAAATATA ACGATTAATG GAGAGGCATT CGAGTTAAAC ATTCCGATGT ATCATTTTAA 6840

TGATCAACCG GACGAAAGCA TTTACGATGT TGTCTTTCTA TTTCCAAAGT CTATGCAATT 6900

AAAAGAAGTG ATGGAAGATA TGAAGCCACA TATTGATAAT GAAACGATCG TCGTATGTAC 6960

GATGAATGGT CTGAAGCATG AAGAAGTCAT TGCGCAGTAT GTTGCTCAAT CACAAATTGT 7020

CAGAGGTGTT ACGACTTGGA CGGCAGGTCT TGAAAGCCCT GGACACAGTC ATTTACTTGG 7080

TAGTGGACCA GTTGAAATAG GTGAACTAGT GGATGAAGGT AAAGAAAATG TTATAAAAGT 7140

TGCTGATTTA CTTAACGAAG CGGAATTGAA TGGTGTCATT AGTAAAGATT TATACCAATC 7200

GATTTGGAAA AAGATTTGTG TTAATGGTAC GGCAAATGCA TTAAGCACAG TGTTGGAGTG 7260

TAATATGGCA TCGCTGAATG AAAGTAGTTA TGCGAAGTGT TTGATTTATA AATTAACGCA 7320

AGAAATAGTG CATGTAGCGA CGATTGATAA TGTTCATTTA AATGTTGATG AAGTATTTGA 7380

ATATTTAGTT GATTTAAATG AAaAAGTTGG TGCGCATTAT CCATCCATGT ATCAAGATTT 7440

AATTGTTAAT AATAGAAAAA CTGAAATTGA TTATATTAAT GGCGCAGTTG CAACATTAGG 7500

TAAACAACGT CaTATTGAAG CGCCAGTCAA TCGCTTTATT ACTGATTTAA TTCATACTAA 7560
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CAATCACGTG ATATTACGGT CATTATTAAG ATTGAAATGT AATAAATAAA GAACAGCAGT 76 9 0 

AAGGTACTTT CAAATTGAAA TGATCTTGGT GCTGTTTTTC TTGATTGATC TTCGTCATAA 7740

TTCAGATTTG TCATAGGcTA CGACATACTA TTAGTATTTA CTAGACAGTT TTTACGACGA 7800

CACTTTGAAA AATTTTGAGG CAAATCATTT GGAAGTCTCA CGTGAATTTT GTAAACTCAT 7860

CAAGCAAGTA ATTATATTAA AAAGACAAAT AGA3AAAAGG TGTTTATAAT GAGTAAAATT 7920

TTTGTAACTG GTGCAACGGG CCTTATTGGC ATTAAATTAG TTCAAAGACT AAAAGAAGAG 7980

GGGCATGAGG TTGCTGGTTT TACTACATCT GAGAATGGTC AACAAAAGCT AGCTGCTGTT 8040

AATGTAAAAG CATATATTGG TGATATATTA AAAGCTGATA CTATTGATCA AGCGTTAGCA 8100

GATTTTAAAC CAGAAATCAT TATCAATCAA ATTACGGATT TAAAAAATGT TGATATGGCA 8160

GCAAATACGA AAGTACGTAT TGAAGGTTCT AAAAACCTAA TTGATGCGGC GAAAAAGCAT 8220

GACGTTAAGA AAGTAATTGC CCAAAGTATT GCCTTTATGT ATGAACCTGG CGAAGGATTA 8280

GCAAATGAGG AAACT7CACT TGATTTTAAC TCAACTGGCG ATAGAAAAGT AACGGTTGAT 8340

GGTGTGGTTG GTTTAGAAGA AGAAACGGCT CGTATGGATG AATACGTTGT TTTACGTTTT 8400

GGCTGGTTAT ATGGCCCAGG TACTTGGTAC GGAAAAGATG GCATGATTTA TAATCAATTT 8460

ATGGATGGTC AAGTGACACT TTCAGATGGC GTAACATCAT TTGTGCATCT TGATGATGCA 8520

GTTGAAACAT CTATTCAAGC TATTCATTTT GAAAATGGTA TCTATAATGT AGCAGATGAT 8580

GCACCTGTTA AAGGTTCTGA ATTTGCAGAA TGGTATAAAG AACAACTTGG TGTTGAACCA 8640

AATATTGATA TTCAACCTGC GCAACCATTT GAACGTGGCG TAAGCAATGA GAAGTTTAAA 8700

GCGCAAGGTG GTACTCTGAT TTATCAAACT TGGAAAGATG GCATGAATCC AATTAAATAA 8760

TAATTTATCC GTTTAATATA CAAAGAATAA AGACTTGGTC GAATCGTGGA TGATATATTA 8820

TCAAACGCAC GGCTCGAACA AGTCTTTTTT ATTATGTCTT CGTTATCTTT GTATGAAGGA 8880

ATAACAGAAT TACAATTAAT GTACTGAATA ATGCAATTAA TGTTGTGATT AGTGCTAATT 8940

TAATTTCTAT TGGTAGCCAA GTCAGTACAA AAGACCAATT ATTGCTACCG AGAATGAGAT 9000

ATGGTAATGC ATATAATATG AGCGCTAAAG CGATACATAT ACATAATGAT AACCAACTCA 9060

ATACAGCAAT CC 9072

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16826 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

G7GGAACAGC TGTAACTATA TCATTTCTTT CAACATTTAT TGGGAAAATG TTAGCTACAT 60

TTCTATATCC GATTAATAAT GTAGTACTTT CATATATnTC TGTAAATGAA AGTGACAATA 120

TAAAGAAGCA ATATTTGaAA ACTAATCTAA TTGCTATAGC TGCCCTATGT TTAGTCATGA 180

TTATATGTTA TCCAATTACA ATAATTATTG TCTCTTTACT GTATAACATT GATTCAAGTT 240

TATATTCGAA GTTTATTATT TTAGGTAATA TAGGTGTTTT ATTCAATGCA GTGAGTATTA 300

TGATCCAAAC TTTAAATACA AAACACGCAT CAATAACATT ACAAGCGAAT TATATGACGC 360

TTCACACGAT TACATTTATA TTCATAACTA TTTTAATGAC AATTGCGTTT GGTCTAAATG 420

GATTCTTTTG GACAACGCTG TTCAGCAACA TTATTAAGTA TGTGATTTTA AA7ATTATAG 480

GTTTAAAGTC TAAATTCATT AATAAAAAGG ACGTCGATTA GATGAGTGAA AAAAAGATTT 540

TGATTTTATG TCAGTATTTT TATCCGGAAT ATGTATCTTC TGCGACGTTA CCAACTCAAT 600

TGGCGGAAGA TTTAATTGCG AATCACATTA ATGTCGATGT CATGTGTGGA TGGCCATATG 660

AATATAGTAA TCATAAACAG GTTTCTAAAA CCGAGATGCA TCGTGGTATT CGCATTCGAC 720

GTCTCAAGTA TTCGAGGTTT AATAACAAAA GTAAGGTTGG AAGGATCATC AATTTCTTTA 780

GTTTATTTTC AAAATTCGTG ATTAATATAC CTAAAATGTT GAAATATGAT CAGATTCTTG 840

TTTACTCTAA TCCACCAATC TTGCCATTAA TACCAGACGT TTTACACAGA CTGCTTAAGA 900

AAAAATATTC TTTTGTGGTG TATGATATAG CACCTGATAA TGCGATTAAG ACAGGTGCAA 960

CTCGTCCAGG TAGCATGATT GATAAGCTGA TGCGTTACAT TAATAGACAT GTCTACAAGA 1020

ATGCTGAAAA TGTCATTGTC CTTGGTACGG AAATGAAAAA CTACTTACTA AATCATCAAA 1080

TTTCTAAAAA TGCTGACAAT ATCCATGTGA TTCCTAACTG GTATGACATG CGTCAATTAC 1140

AAG^CAATCG TATCTATAAT GACACATTTA AAGCTTACCG TGAGCAATAC GACAAAATTT 1200

TATTGTATAG CGGTAATATG GGGCAGTTAC AGGATATGGA GACACTTATC TCATTTTTAA 1260

AATTAAATAA GGATCAGTCT CAAACGTTAA CAA7ACTTTG TGGTCATGGT AAGAAATTTG 1320

CAGATGTCAA AACGGCAATA GaAGACCATC GTATTGAAAA TGTTAAAATG TTTGAGTTTT 1380

TAACAGGTAC AGACTATGCT GACGTATTAA AAATTGCGGA TGTATGTATT GCATCGCTGA 1440

TTAAAGAAGG CGTCGGTTTA GGCGTGCCGA GCAAGAATTA TGGCTATCTT GCAGCTAAGA 1500

AAGCGTTGGT ACTCATCATG GATAAGCAAT CTGATATCGT TCAACATGTT GAACAATATG 1560

ATGCGGGTAT CCAAATTGAT AATGGCGATG CACATGCCAT TTATAACTTC ATCAACACTC 1620

ACTCGAGTAA GGAATTGCAC GAGATGGGTG AGCGCGCACA TCAACTGTTT AAAGATAAAT 1680
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AAGCGATTAT TCGATGTAGT GAGTTCAATA TATGGTTTAG TAGTTTTAAG TCCGATTCTG 1800 

TTAATTACAG CATTACTAAT TAAAATGGAa TCACCTGGAC CAGCCATTTT CAAACAAAAA 1860

AGACCGACGA TTAATAATGA ATTGTTTAAT ATTTATAAGT TTAGATCAAT GAAAATAGAC 1920

ACACCTAATG TTGCAACTGA TT7AATGGAT TCAACATCGT ATATAACAAA GACAGGGAAG 1980

GTCATTCGTA AGACCTCTAT TGATGAATTG CCACAATTAT TGAATGTTTT AAAAGGAGAA 2040

ATGTCAATTG TAGGTCCTAG ACCAGCGCTT TATAATCAAT ACGAATTAAT CGAAAAACGT 2100

ACAAAAGCGA ACGTGCATAC GATTAGACCA GGTGTGACAG GACTAGCTCA AGTGATGGGG 2160

AGAGATGATA TCACTGATGA TCAAAAAG7A GCGTATGATC ATTATTACTT AACACATCAA 2220

TCTATGATGC TTGATATGTA TATCATATAT AAAACAATTA AAAATATCGT TACTTCAGAA 2280

GGTGTGCATC ACTAATGAGA AAAAATATTT TAATTACAGG CGTACATGGA TATATCGGTA 2340

ATGCTTTAAA AGATAAGCTT ATTGAACAAG GACATCAAGT AGATCAAATT AATGTTAGGA 2400

ATCAATTATG GAAGTCGACC TCGTTCAAAG ATTATGATGT TTTAATTCAT ACAGCAGCTT 2460

TGGTTCACAA CAATTCACCT CAAGCAAGGC TATCTGATTA TATGCAAGTG AATATGTTGC 2520

TGACGAAACA ATTGGCACAA AAGGCTAAAG CTGAAGACGT TAAACAATTT ATTTTTATGA 2580

GTACTATGGC AGTTTATGGA AAAGAAGGTC ATGTTGGTAA ATCAGATCAA GTTGATACAC 2640

AAACACCAAT GAACCCTACG ACCAACTATG GTATTTCCAA AAAGTTCGCT GAACAAGCAT 2700

TACAAGAATT GATTAGTGAT TCGTTTAAAG TAGCAATTGT GAGACCACCA ATGATTTATG 2760

GTGCACATTG CCCAGGAAAT TTCCAACGGT TAATGCAATT GTCAAAGCGA TTGCCAATCA 2820

TTCCCAATAT TAACAATCAG CGCAGTGCAT TATATATTAA ACATCTGACA GCATTTATTG 2880

ATCAATTAAT ATCATTAGAA GTGACAGGTG TGTACCATCC TCAAGATAGT TTTTACTTTG 2940

ATACATCGTC AGTAATGTAT GAAATACGTC GCCAATCACA TCGTAAAACG GTATTGATCA 3000

ACATGCCTTC AATGCTAAAT AAGTATTTTA ATAAGTTGTC GGTCTTTAGA AAATTATTCG 3060

GCAATTTAAT ATACAGCAAT ACGTTATATG AAAATAATAA TGCACTTGAA ATTATTCCTG 3120

GAAAAATGTC ACTTGTTATT GCGGACATCA TGGATGAAAC GACAACCAAA GATAAGGCAT 3180

AAGTCATCTA TTAAATAAAA TCAACATACA AATCGTTTTA TTTGGAGGTT ATAGTATGAA 3240

GTTAACAGTA GTTGGCTTAG GTTATATTGG TTTACCAACA TCAATTATGT TTGCAAAACA 3300

TGGcGTCGAT GTGCTTGGTG TTGATATTAA TCAGCAAACG ATTGATAAGT TACAAAGTGG 3360

TCAAATTAGT ATTGAAGAAC CTGGATTACA AGAGGTTTAT GAAGAGGTAC TGTCATCGGG 3420

AAAATTGAAG GTATCTACAA CGCCAGA7GC ATCTGATGTT TTTATCATTG CCGTTCCGAC 3480
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TAGTATTTTA TCATTTTTAG AAAAAGGAAA TACCATTATT GTAGAGTCGA CAATTGCGCC 3 600 

TAAAACGATG GATGATTTTG TAAAACCAGT CATTGAAAAT TTAGGGTTTA CAATAGGTGA 3660

AGATATTTAT TTAGTGCATT GTCCAGAACG TGTACTGCCA GGAAAAATTT TAGAAGAATT 3720

AGTTCATAAC AATCGTATCA TTGGCGGTGT GACTGAAGCT TGTATTGAAG CGGGTAAACG 3780

TGTCTATCGC ACATTCGTTC AGGGAGAAAT GATTGAAACA GATGCACGTA CTGCTGAAAT 3840

GAGTAAGCTA ATGGAAAACA CATATAGAGA CGTGAACATT GCTTTAGCTA ATGAATTAAC 3900

AAAAATTTGC AATAACTTAA ATATTAATGT ATTAGATGTG ATTGAAATGG CAAACAAACA 3960

TCCGCGTGTT AACATCCATC AGCCTGGTCC AGGTGTAGGC GGTCATTGTT TAGCTGTTGA 4020

TCCGTACTTT ATTATTGCTA AAGACCCTGA AAATGCAAAG TTAATTCAAA CTGGACGTGA 4080

AATTAATAAT TCAATGCCGG CCTATGTTGT TGATACAACG AAGCAAATCA TCAAAGTGTT 4140

GAGCGGGAAT AAAGTCACAG TATTTGGTTT AACTTATAAA GGTGATGTTG ATGATATAAG 4200

AGAATCACCA GCATTTGATA TTTATGAGCT ATTAAATCAA GAACCAGACA TAGAAGTATG 4260

TGCTTATGAT CCACATGTTG AATTAGATTT TGTGGAACAT GATATGTCAC ATGCTGTCAA 4320

AGACGCATCG CTAGTATTGA TTTTAAGTGA CCACTCAGAA TTTAAAAATT TATCGGACAG 4380

TCATTTTGAT AAAATGAAGC ATAAAGTGAT TTTTGATACA AAAAATGTTG TGAAATCATC 4440

ATTTGAAGAT GTATCGTATT ATAATTATGG CAATATATTT AATTTTATCG ACAAATAAAA 4500

TGTGTCAAAC TAGGGCATAC ATGATTAAGG AAAGATAAGC TGTCATGTGT TTGAACTTCA 4560

GAGAGGATAA TGTTATGAAA AAAATTATGG TTATTTTCGG TACGAGACCC GAAGCAATAA 4620

AAATGGCACC ATTAGTAAAA GAAATTGATC ATAATGGGAA CTTTGAAGCG AACATTGTGA 4680

TTACAGCACA ACATAGAGAT ATGTTAGATA GTGTGTTAAG TATATTTGAT ATTCAAGCTG 4740

ATCATGATTT AAATATTATG CAAGATCAAC AAACATTAGC AGGCCTTACG GCGAATGCAC 4800

TTGCTAAACT TGATAGCATC ATTAATGAGG AACAACCGGA TATGATTTTA GTACATGGTG 4860

ATACTACAAC GACTTTTGTA GGAAGTTTGG CAGCATTTTA TCATCAAATT CCGGTCGGAC 4920

ATGTAGAAGC TGGACTTCGA ACACATCAGA AATACTCACC ATTTCCTGAA GAGTTAAATC 4980

GAGTCATGGT AAGTAATATT GCTGAATTGA ATTTTGCGCC AACAGTAATT GCAGCTAAAA 5040

ATTTACTTTT TGAAAACAAA GACAAAGAGC GTATCTTTAT TACTGGAAAT ACAGTTATTG 5100

ACGCATTGTC AACAACAGTT CAAAATGATT TTGTTTCAAC GATTATTAAT AAACATAAAG 5160

GCAAGAAAGT TGTTTTACTA ACAGCGCATC GTCGTGAAAA TATTGGGGAA CCGATGCATC 5220

AGATTTTTAA AGCAGTAAGA GATTTGGCAG ATGAATATAA AGATGTTGTC TTCATTTATC 5280
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20 



GGATTGAATT AATTGAGCCA TTAGATGCGA TTGAGTTCCA TAATTTTACA AATCAATCGT 5400

ACCTCGTGCT GACAGATTCT GGTGGTATTC AAGAGGAGGC TCCTACATTT GGAAAACCTG 5460

TGTTGGTATT AAGGAATCAT ACAGAGCGTC CCGAAGGCGT TGAGGCGGGA ACATCGAGAG 5520

TAATTGGCAC AGATTATGAC AATATTGTTC GAAATGTGAA ACAATTGATT GAGGATGATG 5580

AAGCGTATCA ACGTATGAGT CAAGCGAATA ATCCATATGG TGATGGACAA GCATCACGAC 5640

GTATTTGTGA AGCAATAGAA TATTATTTTG GATTGCGCAC AGACAAGCCG GATGAATTCG 5700

TACCTTTACG TCACAAATAA TAAAAAACCC CTAATCATGA AGTTGGTTTA GACAACCAGC 5760

GGTGACTAGG GGTTTTTAAT ATATTTATTT TTGATAGTGG TAGCCAATAT CATATTTGAA 5820

TACTTTATTT GATAATATTG GACTTTGCTG TCCATCGTCA TCACTTTTTA AACGTACATT 5880

TTTATGAGCT TCTTTAAATA CATCGGAATT CAACCAATTA TTAAAGCTAT CTTCAGATTC 5940

CCAAATAGTT AAGATTTTAA CTTCGTCTGT ATCCTCGGTA TTTAATGTTT TAGTGACAAA 6000

CATTTGTTGG AAGCCTTCAA TAGTTTCAAT ACCTTGTCTA TTGTAAAAAC GTTCAATCGT 6060

TTCTTCCGCA CTGCCTTTTT GTAATTGTAA TCTATTTTCT GCCATAAACA TGGGCAATCA 6120

CTCCTCTATT TTATGATTTG ATTTGGGTAA TGTTTTTACA AATGTAAAGA GTACAGCGGT 6180

TTGTATGATA ACCATTATGA TTAATCCTAC ACGGACTGCA AGAACATCCA CCATATAAAT 6240

TGAAAAACCT ATTACAATGT ATAAGCTAAT TAAAATTTTA ATTTTCTGTT GTAGCGTGTA 6300

GCCTCGATGT AAATAAAAGT TTTCTACATA TTCTTTATAA ATTTTTTGAT TAATAAGCCA 6360

ATTGTAAAAG CGATCTGAAC TTCGAGCAAA GCAAAAAACT GCTACGAGTA AAAAAGGGGT 6420

CGTTGGCAGT AAAGGTAATA CGGCACCTGC AATACCAAGC GCTGTAAATA TTAAGCCAAT 6480

GACGATTAAA ATAAGTCGCA TTGAAAAAAC TCCATTCTAG TACTAATGCG CATGTAATAT 6540

TGTTTTAGTA ATATAACTCA TGCTAAATAT AATGTGTATG ATAAGTGCAA TGACTCAGTA 6600

AAATGAAACG ATGTTGAATT ATCCTTGTCA CATTAACGCA TTTTAAGCGC GACTTTCATA 6660

ACAACCAAAC TATTTAATGA GAATTATTCT CAAGTATTAT AGTTATATTA TGTGTTTTAT 6720

TTTTGAAAAG TGCAATATGT TTTCGAAAAT AAGATTATTT TTATGTGCAA AAACGACGCA 6780

AAAGTTTTAA AAATGAGACT TCTGTGAGCT GATTATTTTA TAAAATGTAA ACGCTTACTA 6840

TATAATGTGA ATCATATCGT TTAAAAGCAT TATTAAATAT GATGCTAAGA GATTTATATT 6900

ATAGCCAATA AACAAAGGAG AGATAATATG GCAGTAAACG TTCGAGATTA TATTGCAGAG 6960

AATTATGGTT TATTTATCAA TGGGGAATTT GTTAAAGGTA GCAGTGACGA AACAATCGAA 7020

GTGACTAATC CAGCAACTGG AGAAACACTA TCACATATTA CAAGAGCAAA AGATAAAGAT 7080
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TCAGAACGTG CACAAAT3TT GCGTGATATT GGTGATAAAT TAATGGCACA AAAAGATAAA 72 0 0 

ATTGCAATGA TTGAAACATT AAATAATGGT AAACCGATTC GTGAGACAAC AGCAATTGAT 7260

ATTCCATTTG CTGCAAGACA TTTCCATTAT TTCGCAAGTG TTATTGAAAC AGAAGAAGGT 7320

ACAGTGAATG ATATCGATAA AGACACAATG AGTATCGTAC GACATGAGCC GATTGGCGTC 7380

GTAGGTGCTG TTGTTGCTTG GAACTTCCCA ATGCTATTAG CTGCATGGAA GA7TGCGCCA 7440

gCCATTGCTG CAGGTAATAC AATTGTGATT CAACCTTCGT CTTCAACACC ATTAAGTTTA 7500

TTGGAAGTTG CTAAAATTTT CCAAGAGGTA TTACCTAAAG GTGTTGTCAA TATACTAACG 7560

GGTAAAGGTT CAGAATCAGG TAATGCAATT TTCAATCATG ATGGTGTAGA TAAATTATCA 7620

TTTACGGGCT CAACTGATGT AGGTTATCAA GTTGCCGAAG CTGCAGCAAA ACATCTAGTA 7680

CCCGCTACAT TAGAGCTTGG TGGTAAAAGC GCCAATATCA TATTAGATGA TGCTAATTTA 7740

GACCTTGCAG TTGAAGGTAT TCAGTTAGGT ATTTTATTCA ACCAAGGTGA AGTATGTAGT 7800

GCAGGTTCTC GATTATTAGT TCATGAAAAA ATTTATGATC AATTGGTGCC ACGTTTACAA 7860

GAGGCATTTT CAAATATTAA AGTTGGAAAT CCACAAGATG AAGCTACACA AATGGGTAGT 7920

CAAACTGGTA AGGATCAATT AGATAAAATT CAATCATATA TTGATGCAGC AAAAGAATCA 7980

GATGCACAAA TTTTAGCAGG CGGTCATCGC TTAACTGAAA ATGGATTAGA TAAAGGGTTC 8040

TTCTTTGAGC CGACATTAAT TGctGTGCCA GACAATCATC ACAAATTAGC ACAAGAAGAA 8100

ATATTTGGAC CAGTGTTAAC AGTGATTAAA GTGAAGGACG ATCAAGAAGC AATTGATATA 8160

GCTAATGATT CTGAGTATGG TTTAGCAGGC GGTGTATTTT CTCAAAATAT CACACGTGCA 8220

TTAAATATTG CTAAAGCTGT ACGTACAGGA CGTATTTGGA TTAACACTTA CAACCAAGTA 8280

CCAGAAGGCG CACCATTTGG TGGTTATAAA AAATCAGGTA TCGGTCGAGA AACTTATAAA 8340

GGTGCGTTAA GTAACTATCA ACAAGTTAAA AATATTTATA TTGATACAAG CAATGCTTTA 8400

AAAGGTTTGT ACTAGAATAA ATATCGTTTC TGAAGCGTGT TTGTAGGTCA GTCTAGCGGT 8460

AAGTCTTAAC ATTTAACGGC GTTGTTTAGA TTTTAAGCAA AACAAAATAT ATAGGAACAC 8520

GTATCATGAT ATTAGGATAT AATGACTAAA ATAATAGCAG TAGGATGGTT TTTAATTGCA 8580

AATCATCTTA CTGCTGTTTT TAATTATGCT AATTTGCGAT GCGGCTATTA TAAGGACAGA 8640

GTTGTTTATT AATTATGGTG ATTTAGAAAT ATGAAGTTCA A7ATGCAAAG TCATCGTTTG 8700

TTTTAATATG CGGAACAATC ATTAAAGTTA TTGCGATTTT TTGAACTTAA TGAAACTAAA 8760

CAATAAATTT GAGATACTTT TTTGTCATTT TTATGTAACT AACACAATAA TCTCGTACAT 8820

TATTAAAATT TTCTATATGA TAGGAATAAA GCAAAGCGCG AGTGTGCTGT AAAAGTTTTC 8880
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GATGATGTAT AAATCATGGT TAATTACGGA AGCATTAATA TTAACCTGAG AAGCTATAAA 9000

GAATTATTTT TAAAAGCGAC AATATTAAAT ACGACGCATT TATTTAGGAG TGGCAAACGT 9060

ATGAATGGGA AAAAGGCGAA TACGATAAAC AGATACAAAT ATTTTCATCA TGTCAATCAT 9120

CAAAAAATTC AACAAAGTTC TAAAAAGACG CTGTGGGCAT CACTAATCAT CACATTGTTA 9180

TTTACAGTGA TTGAATTTGT CGGAGGTTTA GTATCTAATt CATTGGCATT ACTGTCAGAT 9240

TCATTTCATA TGCTTAGTGA TGTATTAGCA CTTGGTTTAT CTATGTTGGC CATTTATTTT 9300

GCAAGTAAAA AGCCGACTGC ACGATACACA TTTGGATATT TAAGATTTGA GATATTAGCT 9360

GCATTTTTAA ATGGTTTAGC ATTAATTGTA ATTTCAATCT GGATTTTATA TGAAGCTATT 9420

GTACGTATTA TTTATCCGCA ACCAATTGAA AGTGGCATTA TGTTTATGAT TGCTAGTATT 9480

GGTTTACTCG TCAATATTAT TTTGACTGTT ATCCTTGTAA GGTCTTTAAA ACAAGAAGAC 9540

AATATCAATA TTCAAAGTGC ATTATGGCAT TTCATGGGAG ACTTATTGAA CTCTATTGGT 9600

GTCATCGTTG CAGTTGTATT GATTTACTTT ACAGGATGGC GCATCATCGA CCCAATCATT 9660

AGTATTGTAA TTTCACTCAT CATTTTACGT GGTGGTTATA AAATTACGCG TAATGCgTGG 9720

tTAATTTTAA TGGAAAGTGT GCCTCAACAT TTGGATACTG ATCAAATTAT GGCAGATATT 9780

AAAAACATAG ATGGCATATT AGATGTACAT GAATTTCATT TGTGGAGTAT TACAACAGAG 9840

CATTATTCAT TAAGTGCCCA TGTTGTGTTA GATAAAAAAT ATGAGGGTGA TGATTATCAA 9900

GCGATTGATC AAGTATCATC ATTGTTGAAA GAAAAATATG GCATTGCACA TTCAACGTTG 9960

CAAATTGAAA ACTTGCAATT GAATCCATTA GATGAGCCAT ACTTCGACAA ATTAACATAA 10020

ATAAAACATT GTAGCGCCTA AAACATTAAT CTATGTCATA GGCGCACGTT TCGTTTTATA 10080

CTTATGTTGC ATCATTTAAA TGATTTTCGT CAATTTCTTT GATGCTATCT ACATCTAACA 10140

CGACATCTTT AGGTTTCAAA ATATGAATAT GTTTTTCATC ATTTGTATGT AAAATGCGTT 10200

CTATGATGTA CCTTTGACCG GCCATTGTTT CTACAGCAAT CTTTTTGTTT CTAGCTAAAC 10260

TTGCTACGAC AGATTCTTTA TCCATAATGA TAGCCCCCTA TATATATGTT TATTTACTTA 10320

TACCCTAACA TGATTTTTAT ACTCTTTGAA AATATATTTT ACAGAATTTT ATCTAAATAT 10380

TTAAAAAAAT ATCTTAATAT CCTTGTAATC CGATAAGAAT TATAGTAATA TTTTTTCAAC 10440

CATtGTTATA GGAGGTCTTA TTAATGACAT TATTTTTATT AGAAGCTAAC AATCTTGATT 10500

TTGCATCAAC GAAAGAAGAA CTAGAAGCAA AGGCAGCATC ACTATCTACG AAGACAATTC 10560

CAACATTAAT TGAAGTACAA GCTACTGAAA ATTTAACTCA TGGTTATTTT ATTGTGGAAG 10620

CAAATGACGA aGCAGAAGCT AAACAATTTT TAACAGAAGC AGATATTAGT ATTCAATTAG 10680
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TTGA7TACCT TGTAAC7TGG AACATTCCGG AAGGCATTAC GATGGATCAA TATTTAGCAC 10 3 00 

GTAAAAAGAA AAATTCTGTT CATTATGAAG AAGTGCCAGA AGTTGAATTT AAACGCACAT 10860

ATGTATGTGA AGATATGTCT AAATGTATTT GTTTATACAA CGCACCTGAT GAAGAAGCGG 10920

TACGTCGCGC GCGCAAAGCA GTTGATACAC CGATTGATGG CATCGAAAAA CTTTAATAAG 10980

ACAACAAGTT GATGAGATAT ATGTATATAG GTTTGGCATG GATTTCGATT GCAGTTAATT 11040

AGAATAGCTC AATGCTATAA ATGTAAGTAG TTGATATGAA GAAACTAATG AACTAAATGC 11100

AAGTATTGTC TAAAACAATC ATTTTATTGA AATTTAGTAG AGCTGAAATT AATATAACGT 11160

CGTTAATTGA ATAACGCTTA TGTTATAAGA GCACTCATAC CAAACCATAA TCATCTATAG 11220

ATATAACAAT TCACGATATA AGGGCTGTGT TTGGCATAGC CCTTTAGATA TACACTTAAT 11280

TCCTATTAAA ATAGTAGGGA TTAAAAGGGG GCTTGTCATG ATTAAAATTC AACAATTACA 11340

ACATCACTTT GGATCACATA AAGTAATTCA TAACTTTAAT TTGGACATTA GCAAGGGAGA 11400

AATAGTCACT TTCATAGGGA AAAGTGGTTG CGGAAAGTCT ACTTTACTCA ATATTATCGG 11460

TGGATTTATT CATCCATCGT CTGGTCGTGT CATTATTGAT AACGAAATTA AACAACAGCC 11520

ATCTCCAGAT TGTTTAATGC TATTTCAACA TCATAATTTG CTGCCATGGA AAACGATTAA 11580

TGACAACATT AGGATTGGAT TACAACAGAA AATTAGTGAT GAAGAGATTA ACGCACAGCT 11640

TAAATTAGTT GATTTAGAAG ACAGGGGAAA GCATTTTCCC GAGCAACTGT CCGGGGGTAT 11700

GAAACAACGT GTGGCACTAT GTCGAGCGCA TGTGCATAAG CCTAACGTTA TATTGATGGA 11760

TGAGCCATTA GGTGCATTAG ATGCATTTAC ACGTTATAAA CTTCAGGATC AACTAGTGCA 11820

aCTAAAACAT AAAACGCAAT CAACTATTAT TTTAGTGACG CATGACATTG ATGAAGCTAT 11880

TTATCTTTCC GACCGCATTG TTCTGTTAGG TGAAGGGTGC AATATTATTT CTCAATATGA 11940

AATTACAGCA TCACATCCAC GCAGTCGTAA TGATAGCCAC CTACTTAAGA TTCGTAATGA 12000

AATTATGGAA ACATTTGCAT TGAATCATCA TCAAGTTGAA CCTGAATATT ATTTATAAGG 12060

AGTGAGTGAC GATGAAAAGG TTAAGCATAA TCGTCATCAT TGGAATCTTT ATAATTACAG 12120

GATGTGATTG GCAAAGGACG TCTAAAGAAC GGTCTAAAAA TGCCCAAAAT CAGCAAGTGA 12180

TTAAAATTGG ATATTTGCCG ATTACACATT CAGCTAATTT GATGATGACT AAAAAATTAT 12240

TATCACAATA CAATCATCCG AAATATAAAC TAGAATTAGT TAAATTCAAT AATTGGCCAG 12300

ATTTAATGGA CGCATTAAAC AGTGGTCGTA TTGATGGTGC ATCAACTTTA ATAGAGCTAG 12360

CGATGAAATC AAAACAGAAG GGCTCAAATA TAAAGGCTGT GGCATTGGGC CATCATGAAG 12420

GCAATGTCAT TATGGGACAA AAAGGTATGC ACTTAAATGA ATTTAATAAT AATGGCGATG 12480
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GTAAACAATT AAAGATTAAA CCGGGGCATT TTAGCTATCA TGAAATGTCG CCAGCAGAAA 126 00 

TGCCAGCCGC ATTGAGTGAA CACAGAATTA CAGGGTATTC TGTAGCCGAA CCATTCGGTG 12660

CACTGGGTGA AAAGTTAGGC AAAGGTAAGA CTTTGAAACA TGGTGATGAC GTTATACCTG 12720

ATGCGTATTG CTGTGTGCTA GTACTGAGAG GGGAATTGCT TGATCAACAC AAGGATGTAG 12780

CGCAAgCATT TGTACAAGAT TATAAAAAGT CTGGCTTTAA AATGAATGAT CGCAAGCAAA 12840

GTGTAGACAT TATGACGCAT CATTTTAAAC AAAGTCGTGA CGTTTTAACA CAGTCAGCGG 12900

CATGGACATC CTATGGTGAT TTAACAATTA AGCCATCCGG CTATCAAGAA ATTACGACAT 12960

TGGTAAAACA ACATCATTTG TTTAATCCAC CTGCATATGA TGACTTTGTT GAACCGTCAT 13020

TGTATAAGGA GGCATCGCGT TCATGACACG TCCCACAAAT AACAAATTTA TATTACCTAT 13080

TATCACATTT ATTATTTTCT TAGGCATTTG GGAAATGGTC ATTATTATTG GGCATTACCA 13140

ACCTGTATTG TTACCGGGTC CTGCTCTTGT AGGAAAAAGT ATATGGTCTT TCATTGTTAC 13200

TGGAGAAATT TTCCAACATT TAGCAATTAG TTTATGGAGA TTTGTAGCGG GCTTTGTTGT 13260

CGCATTGTTG GTTGCTATTC CATTGGGCTT CTTGCTTGGA AGGAATCGTT GGCTATACAA 13320

CGCTATCGAA CCGCTATTTC AATTGATTAG GCCGATATCT CCGATAGCAT GGGCACCATT 13380

TGTTGTTCTA TGGTTTGGTA TTGGTAGTTT GCCAGCGATT GCGATTATTT TTATCGCTGC 13440

TTTTTTCCCA ATTGTGTTCA ATACTATTAA AGGCGTTAGA GACATTGAAC CTCAATATTT 13500

AAAAATAGCA GCAAATTTAA ATTTAACTGG GTGGTCATTG TATCGCAATA TATTATTTCC 13560

CGGGGCATTT AAACAAATCA TGGCTGGGAT ACATATGGCG GTAGGAACAA GTTGGATATT 13620

TTTAGTTTCT GGTGAAATGA TTGGTGCACA ATCGGGATTA GGTTTTTTAA TCGTTGATGC 13680

ACGAAATATG TTGAACTTAG AAGATGTTTT AGCAGCAATA TTCTTTATCG GATTATTTGG 13740

TTTTATTATT GATCGATTCA TTAGTTATAT TGAGCAGTTT ATACTTAGAA GATTTGGTGA 13800

ATAAGGAGAG ATGATGATGA CTTTAGAAAC GCTTATCAAA GAACAATTAG ATCCTCATTT 13860

AGTAGAAGTT GATGAAGGGA CGTATTATCC GAGAACATTT ATTCAGCAAT TATTTGTAGA 13920

TGGTTATTTC GGTGAGGCGG CATTGAGAAA AAATGCTGAA GTAATCGAAG CTGTATCGCA 13980

GTCTTGTTTG ACAACAGGAT TTTGTTTATG GTGCCAATTA GCTTTTTCAA CGTATTTAGA 14040

AAATGCCACG CAGCCACATT TAAATAATGA CTTACAACAG CAATTGTTAT CTGGAGAAAT 14100

ATTAGGTGCT ACCGGATTGT CTAATCCGAT GAAGTCATTT AATGATTTAG AAAAGTTGAA 14160

CCTTGAACAC ACTTATGTTG ATGGACAATT GGTTGTCAGT GGACGTATGC CAGCTGTAAG 14220

TAATATTCAA GAAGACCATT ATTTTGGTGC GATTTCGAAA CATGAATCAT CAGATGAATT 14280
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TT7AGGAGTC AACGGGTCAG CAACGTATCA AATCACATTG AATCAAGTCG TAGTGCCACA 14400 

ATCACAAATT ATCACGCATG ATGCGAAGCA GTTTGCGGCA ACTATTCGCC CGCAATTTAT 14460

TGCTTACCAA A7TCCAA7AG GATTAGGCTC AATTAAAAGT TCTTTAGAGT TAATTGA7GC 14520

AT7TTCAAAT GTGCAAAACG GAATAAATCA ATATTTAGAG TATGATGTTG AAGCTTTTAA 14580

AAAACGTTAT CGTCAACTTA GAGAGGAA7A TTATGCAATA TTAGATGACG GTAACTTAAC 14640

TTCACATTTA AATGAA77AA TATCATTGAA GAAGGACATC GGCTATTTAT TGT7AGATGT 14700

AAATCAAGCT TCTGTTGTCA ATGGTGGTTC TAGAGCGTAC ACACCATATT CGCCACAAGT 14760

TCGCAAGTTA AAAGAAGGAT TCTTCTTCGC AGCATTGACA CCGACATTAA GACATTTAGG 14820

TAAACTTGAA GCAGAGTTGA AGGGGTAAGT GTGATAAGCT GATTTTTTGT TTAGATGCGT 14880

TTGTTGAAAC ATTTTTTAAA ATAATATAAA TCTTAGTTTA TAAACATTTT CTGTTAATTT 14940

GTTATATCCT TTTAACTAGG AAAATATACA TTTCGTAATA ATAATAATCG TTATCATTGA 15000

AAAAGTGTTA ATAAGGTGTA TAATGAAAAT GTGAACAATT AATGAACTTC TTATTTTAAA 15060

GAAGGTGAAT ACTATAGATA CGCATACTAA AGAACAACAA TTCTCGAATC TAGTAAGATC 15120

TTATCGTAAA GAATACGTGG GTAAAGGACC CAATAGTATT CGAGTGTCGT TTAAAGATAA 15180

TTGGGCGATT GCACATATGA CAGGTGTTTT GAGTAAAGTT GAGAGTTTTT ACCTAAACGA 15240

CAAACGCAAT GAATCGATGC TCCATTATAC ACGCACAGAG AAGATTAAAC AGATGTATAA 15300

AGAAATAGAT GTAAATGAGA TGGAAAGTCT TGTAGGCGCT AAGTTTGTAA AATTATTTAC 15360

AGATATTGAT TTGAATGATG ATGAAGTCAT TTCAATATTT GTTTTCGATA AGTCAATAGA 15420

ATAAGTGTTG CTGGTGTAAG GTACACGGTG CTGTTTGCTA ACTTCGCTTT GAATTTAACA 15480

ATAATTCAAG GGGGTGGTAT GTCAAACGGT GCCGTTTTTT TGTCATATTT TTAAAACAAG 15540

CAA^ATGCAA CACGTACTTT AAGGAAGTCA AAATTTATCA TTTAGGAGAG ATGGATATGA 15600

AAATCGTAGC ATTATTTCCA GAAGCAGTAG AAGGTCAAGA AAATCAATTA CTTAATACTA 15660

AAAAAGCATT AGGATTAAAA ACATTTTTAG AGGAAAGAGG ACATGAGTTC ATTATATTAG 15720

CAGATAATGG TGAAGACTTA GATAAACATT TACCAGATAT GGATGTGATT ATTAGTGCGC 15780

CATTTTATCC TGCATATATG ACTCGTGAAC GTATTGAAAA AGCACCGAAC TTGAAATTAG 15840

CAATTACAGC AGGTGTAGGA TCTGACCATG TAGATTTAGC GGCAGCAAGT GAACACAATA 15900

TTGGTGTCGT TGAAGTTACA GGAAGTAATA CAGTTAGTGT GGCAGAACAT GCGGTTATGG 15960

ATTTATTAAT ACTTCTTAGA AACTATGAAG AAGGTCATCG TCAATCAGTA GAAGGTGAAT 16020

GGAACTTGTC TCAAGTAGGT AATCATGCGC ATGAATTACA ACACAAAACA ATTGGTATTT 16080
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20 



TACAACACTA TGATCCAATC AATCAACAAG ACCATAAATT GTCTAAATTT GTAAGCTTTG 16200

ATGAACTTGT TTCAACAAGT GATGCGATTA CAATTCATGC ACCATTAACA CCAGAAACTG 16260

ATAACTTATT TGATAAAGAT GTTTTAAGTC GTATGAAAAA ACACAGTTAT TTAGTGAATA 16320

CTGCACGTGG TAAAATTGTA AATCGCGATG CGTTAGTTGA AGCGTTAgCA TCCGAGCATT 16380

TACAAGGATA TGCTGGTGAT GTTTGGTATC CaCAACCtGC ACCTGCTGAT CATCCATGGA 16440

GAACAATGCC TAGAAATGCT ATGACGGTTC ACTATTCAGG TATGACTTTA GAAGCACAAA 16500

AACGTATTGA AGATGGAGTT AAAGATATTT TAGAGCGTTT CTTCAATCAT GAACCTTTCC 16560

AAGATAAAGA TATTATTGTT GCAAGTGGTC GTATTGCTAG TAAAAGTTAT ACAGCTAAAT 16620

AGAATAAGGA TGCTGGGCTA GCGATTAACG CTTTCAATTT TATATAAATG AATCATATAA 16680

GCACTACTGC TGTTGTAAAG ATGGCAGTAG TTTTTTTATG ATTACATCTA AGTATAGTCA 16740

CGGCTATGTT AGGACAATGA TTTAACATTT ACGCACATAT GTGTTCACTT ACGCAATTAT 16800

TGAnAAATnT CATTCATGTG GnAATC 16826
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4012 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

TTCAATGAGA GTAGTGGGCT GATGTTTAGC GATATCGCGT AAGATTAACC ATTGGCCATA 60

ATATATATTG TGTTTTTCTA AAATCGGCTC GGCTAATTTT AAATAGGGGC GATATATTGT 120

TATMAACTA TTGAAAAATT CTTGTGATAG CATAGTGACA TCTCCTAAGA CAAAATAGTT 180

AGCTTAGCTA mCCTTTITAC AACAATAGTA ATTATAAAAC GGGAGCAATT AGAAATCAAT 240

ATATAATTAT TAAGAGCAAA AATAATTATA CTTTGTTAAA ATAAGCGTAA TTACATGTAA 300

ATAGGGGGAT ACTAATGATA TTGAAATTTG aTCACATCAT TCATTATATA GATCAGTTAG 360

ATCGGTTTAG TTTTCCAGGA GATGTTATAA AATTACATTC AGGTGGGTAT CATCATAAAT 420

ATGGAACATT CAATAAATTA GGTTATATCA ATGAAAATTA TATTGAGCTA CTAGATGTAG 480

AAAATAATGA AAAGTTGAAA AAGATGGCAA AAACGATAGA mGGCGGAGTC GCTTTTGCTA 540

CTCAAATTGT TCAAGAGAAG TATGAGCAAG GCTTTAAAAA TATTTGTTTG CGTACAAATG 600

ATATAGAGGC AGTTAAAAAT AAACTACAAA GTGAGCAGGT TGAAGTAGTA GGGCCGATTC 660
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ATCAGGATGA TGATGAAATT AAGCCACCAT TT7TTATTCA ATGGGAAGAA AGTGATTCCA 780

TGCGTACTAA AAAATTGCAA AAATATTTTC AAAAACAATT TTCAATTGAA ACTGTTATTG 840

TGAAAAGTAA AAACCGATCA CAAACAGTAT CGAATTGGTT GAAATGGTTT GATATGGACA 900

TTGTAGAAGA GAATGACCAT TACACAGATT TGATTTTAAA AAATGATGAT ATTTATTTTA 960

GAATTGAAGA TGGTAAAGTT TCAAAATATC ATTCGGTTAT CATAAAAGAC GCACAAGCAA 1020

CTTCACCATA TTCAATTTTT ATCAGAGGTG CTATTTATCG CTTTGAACCA TTAGTATAAA 1080

TATACGTAAG TGCTATGAGC GAGAATGCCC ATATGAATAA TGACAAGCAC AATGGAAAGA 1140

ATCGTTAATA TATTATTTAA TCGTGATGAC TTAATTAAAA TGAAAAAGAT TGATAATATA 1200

AATGTGAAAA AGATAAGTAT AACCCGTAAA CTAAAGTAAT TCACGGTGAG AGGTTGACTC 1260

AATGTCATAA TGATTGCAAC GATGTTCATA ATTATAAATA GACTTAAAAT AATTGTTCTC 1320

ATATCAAACA CCTCATTGTT AGATTATTGA CATTATAACA GGGGTAATTG TATATGAACA 1380

TTAATGTGGT TGCTTGAGGA AAAATTTATT CATTGAAGTC AAGTTGGTTC ATTTTAGAAA 1440

TGAATATCGT GTTAGATGAT GAAAGTATAT TGAAGTATAG GTAACTAGTT GAAAAGTATT 1500

AATTGTACGA TAACATTAAA TTTAACACGA AACATAGATA TAAAATGATT CACAATTAAA 1560

ATGGGTAAAT TTGAACTTGC TAAACTATTA ATTGGAGCAT GGACATTTCA AAAATAAGAG 1620

TTCAAATCTT ACACAAGCTC TGAATCGACA CTATAAGATA CAAACTGTAT AATTAAAGGT 1680

ATTGTTAAAT AGAAGGAGAT ATCATAAATC ATGGAAAAGA TGCATATCAC TAATCAGGAA 1740

CATGACGCAT TTGTTAAATC CCACCCAAAT GGAGATTTAT TACAATTAAC GAAATGGGCA 1800

GAAACAAAGA AATTAACTGG ATGGTACGCG CGAAGAATCG CTGTAGGTCG TGACGGTGAA 1860

GTTCAGGGTG TTGCGCAGTT ACTTTTTAAA AAAGTACCTA AATTACCTTA TACGCTATGT 1920

TATATTTCGC GTGGTTTTGT TGTTGATTAT AGTAATAAAG AAGCGTTAAA TGCATTGTTA 1980

GACAGTGCAA AAGAAATTGC TAAAGCTGAG AAAGCGTATG CAATTAAAAT CGATCCTGAT 2040

GTTGAAGTTG ATAAAGGTAC AGATGCTTTG CAAAATTTGA AAGCGCTTGG TTTTAAACAT 2100

AAAGGATTTA AAGAAGGTTT ATCAAAAGAC TACATCCAAC CACGTATGAC TATGATTACA 2160

CCAATTGATA AAAATGATGA TGAGTTATTA AATAGTTTTG AACGCCGAAA TCGTTCAAAA 2220

GTGCGCTTGG CTTTAAAGCG AGGTACGACA GTAGAACGAT CTGATAGAGA AGGTTTAAAA 2280

ACATTTGCTG AGTTAATGAA AATCACTGGG GAACGCGATG GCTTCTTAAC GCGTGATATT 2340

AGTTACTTTG AAAATATTTA TGATGCGTTG CATGAAGATG GAGATGCTGA ACTATTTTTA 2400

GTAAAGTTGG ATCCAAAAGA AAATATAGCG AAAGTAAATC AAGAATTGAA TGAACTTCAT 2460
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CAAAATA7GA TTAATGATGC GCAAAATAAA ATTGCTAAAA ATGAAGATTT AAAACGAGAC 25 9 0 

CTAGAAGCTT TAGAAAAGGA ACATCCTGAA GGTATTTATC TTTCTGGTGC ACTATTAATG 2640

TTTGCTGGCT CAAAATCATA TTACTTATA7 GGTGCGTCTT CTAATGAATT TAGAGATTTT 2700

TTACCAAATC ATCATATGCA GTATACGATG ATGAAGTATG CACGTGAACA TGGTGCAACA 2760

ACTTACGATT TCGGTGGTAC AGATAATGAT CCAGATAAAG ACTCAGAACA TTATGGATTA 2820

TGGGCATTTA AAAAAGTGTG GGGAACATAC TTAAGTGAAA AGATTGGTGA ATTTGATTAT 2880

GTATTGAATC AGCCATTGTA CCAATTAATT GAGCAAGTTA AACCGCGTTT AACAAAAGCT 2940

AAAATTAAAA TATCTCGTAA ATTAAAACGA AAATAGATTA ACGACTGAAA TCTGAACGCT 3000

CATAAGACTG TCATTTGCGT TCAGATTTTT TTACACAATA TAGAATGGTT GAGTAAAATA 3060

TTTTTGAATA TAGTGAAAGA GGGGGAAGTA CTGTGATAAA AAAGCTATTA CAATTTTCTT 3120

TAGGGAATAA GTTTGCTATC TTTTTAATGG TTGTTTTAGT TGTCTTGGGC GGTGTATATG 3180

CGAGTGCTAA ATTGAAATTA GAATTACTAC CAAATGTACA AAATCCAGTT ATTTCAGTTA 3240

CAACAACAAT GCCGGGTGCA ACGCCACAAA GTACCCAAGA TGAAATAAGT AGTAAAATTG 3300

ACAATCAAGT AAGATCATTG GCATATGTGA AAAATGTTAA AACGCAATCC ATACAAAATG 3360

CTTCAATTGT AACAGTTGAA TATGAAAATA ATACAGATAT GGATAAAGCA GAAGAACAGC 3420

TTAAAAAAGA AATCGATAAA ATTAAATTTA AAGATGAAGT TGGTCAACCA GAATTAAGAC 3480

GTAATTCGAT GGATGCTTTT CCGGTTTTAG CATATTCATT TTCAAATAAA GAGAATGACT 3540

TGAAAAAAGT AACGAAAGTA CTGAATGAAC AATTAATACC AAAATTGCAA ACGGTAGATG 3600

GTGTGCAAAA TGCGCAATTA AATGGGCAGA CGAACCGTGA AATCACCCTT AAATTTAAGC 3660

AAAATGAACT TGAAAAATAT GGGTTGACTG CTGATGATGT AGAAAACTAT CTAAAAACGG 3720

CAACAAGAAC AACGCCACTT GGATTGTTCC AATTTGGTGA TAAAGATAAT CAATTGTTGT 3780

TGATGGTCAA TATCAATCTG TTGATGCTTT TAAAAACATA AATATTCCAT TAACGTGGCA 3840

GGAGGACCAA GGGCATCTCA TCCCAAAGTG ACCATAAACC AAATTCAGCC ATGTCAGACG 3900

TTATCAGGCA TCACCACAGC AAATTCAAAG CGTCAGCnCC AATATATAGT GGATGCCGCA 3960

nGAACTAGGG GTTTAGCGnT ATCAGTGGTG TGGCGACTCT ATTCTAAACG AT 4012

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7778 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

CAATATAGGT CGCCGAGTTT CAACTaCATC AACTGGTTCA GTTACATTAG ATAATGCGCT 60

AGGTGTAGGT GGCTATCCTA AAGGACGAAT TATTGAAATT TATGGTCCTG AAAGTTCTGG 120

TAAGACAACA GTAGCGCTTC ACGCTATTGC TGAAGTACAA AGTAATGGCG GGGTGGCAGC 180

ATTTATCGAT GCTGAACATG CTTTAGATCC AGAATATGCT CAAGCAT7AG GCGTAGATAT 240

CGATAATTTA TATTTATCGC AACCGGATCA TGGTGAACAA GGTCTTGAAA TCGCCGAAGC 300

ATTTGTTAGA AGTGGTGCAG TTGATATTGT AGTTGTAGAC TCAGTTGCTG CTTTAACACC 360

TAAAGCTGAA ATTGAAGGAG AAATGGGAGA CACTCACGTT GGTTTACAAG CTCGTTTAAT 420

GTCACAAGCG TTACGTAAAC TTTCAGGTGC TATTTCTAAA TCAAATACAA CTGCTATTTT 480

CATCAACCAA ATTCGTGAAA AAGTTGGTGT TATGTTCGGT AATCCAGAGA CTACACCAGG 540

TGGACGTGCA TTAAAATTCT ATAGTTCAGT AAGACTAGAA GTACGTCGTG CAGAACAGCT 600

TAAACAAGGA CAAGAAATTG TAGGTAATAG AACTAAAATT AAAGTCGTTA AAAATAAAGT 660

GGCACCACCA TTTAGAGTAG CTGAAGTTGA TATTATGTAT GGACAAGGTA TTTCTAAAGA 720

GGGTGAACTT ATTGATTTAG GTGTTGAAAA CGACATCGTT GaTAAATCAG GAGCATGGTA 780

TTCTTACAAT GGCGAACGAA TGGGTCAAGG TAAGGAAAAT GTTAAAATGT ACTTGAAAGA 840

AAATCCACAA ATTAAAGAAG AAATTGATCG TAAATTGAGA GAAAAATTAG GTATATCTGA 900

TGGTGATGTT GAAGAAACAG AAGATGCACC AAAGTCATTA TTTGACGAAG AATAGTACAC 960

AAATTTATAT CTATAGTTAA ACTTAGCAAA TATCCTTATA GGATTGATTG AAAGTGATAT 1020

TCATCTCATA AAGCTAGAAT AATATCTAAC TTTATGGGAT ACACTACAAA TCGAGACTAT 1080

AAGGTTTTTT ATTTTATTTA TTATTACATT ATCAATAGTT TTATAATCGA GCTTCAAAAC 1140

TTTAGAAAAT AGTAGAAATA GCATTCAATA TAGTGCAAAA GTGCAAATTG ATAACTTGAC 1200

ACTTATCTCC TATAAACCGT ACAATTAATT TGTATGATTT ATATATAATT TCATAAAGTC 1260

ATATTGAATT TCATATAAAG AGCAAACCCT AGAAAAGGAG GTGTTTGTGT GAATTTATTA 1320

AGCCTCCTAC TCATTTTGCT GGGGATCATT CTAGGAGTTG TTGGAGGGTA TGTTGTTGCC 1380

CGAAATTTGT TGCTTCAAAA GCAATCACAA GCTAGACAAA CTGCCGAAGA TATTGTAAAT 1440

CAAGCACATA AAGAAGCTGA CAATATCAAA AAAGAGAAAT TACTTGAGGC AAAAGAAGAA 1500

AACCAAATCC TAAGAGAACA AACTGAAGCA GAACTACGAG AAAGACGTAG CGAACTTCAA 1560

AGACAAGAAA CCCGACTTCT TCAAAAAGAA GAAAACTTAG AGCGCAAATC TGATCTATTA 1620

GATAAAAAAG ATGAGATTTT AGAGCAAAAA GAATCAAAAA TTGAAGAAAA ACAACAACAA 1680
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CGCATCTCCG GTCTCACTCA AGAAGAAGCT ATTAATGAGC AACTTCAAAG AGTAGAGGAA 130 0 

GAACTGTCAC AAGATATTGC AGTACTTGTT AAAGAAAAAG AAAAAGAAGC TAAAGAAAAA 1860

GTTGATAAAA CAGCAAAAGA ATTATTAGCT ACAGCAGTAC AAAGATTAGC AGCAGATCAC 1920

ACAAGTGAA7 CAACGGTATC AGTAGTTAAC TTACCTAATG ATGAGATGAA AGGTCGAATC 1980

ATTGGACGAG AAGGACGAAA CATCCGCACA CTTGAAACTT TAACTGGCAT TGATTTAATT 2040

ATTGATGACA CACCAGAAGC GGTTATATTA TCTGGTTTTG ATCCAATAAG AAGAGAAATT 2100

CCTACAACAG CACTTGTTAA CTTAGTATCT GATGGACGTA TTCATCCAGG TAGAATTGAA 2160

GATATGGTCG AAAAAGCTAG AAAAGAAGTA GACGATATTA TTAGAGAAGC AGGTGAACAA 2220

GCTACATTTG AAGTGAACGC ACATAATATG CATCCTGACT TAGTAAAAAT TGTAGGGCGT 2280

TTAAACTATC GTACGAGTTA CGGTCAAAAT GTACTTAAAC ATTCAATTGA AGTTGCGCAT 2340

CTTGCTAGTA TGTTAGCTGC TGAGCTAGGC GAAGATGAGA CATTAGCGAA ACGAGCTGGA 2400

CTTTTACATG ATGTTGGTAA AGCAATTGAT CATGAAGTAG AAGGTAGTCA TGTTGAAATC 2460

GGTGTAGAAT TAGCGAAAAA ATATGGTGAA AATGAAACAG TTATTAATGC AATCCATTCT 2520

CATCATGGTG ATGTTGAACC TACATCTATT ATATCTATCC TTGTTGCTGC TGCAGATGCA 2580

TTGTCTGCGG CTCGTCCAGG TGCAAGAAAA GAAACATTAG AGAATTATAT TCGTCGATTA 2640

GAACGTTTAG AAACGTTATC AGAAAGTTAT GATGGTGTAG AAAAAGCATT TGCGATTCAG 2700

GCAGGTAGAG AAATCCGAGT GATTGTATCT CCTGAAGAAA TTGATGATTT AAAATCTTAT 2760

CGATTGGCTA GAGATATTAA AAATCAGATT GAAGATGAAT TACAATATCC TGGTCATATC 2820

AAGGTGACAG TTGTTCGAGA GACTAGAGCA GTAGAATATG CGAAATAATT TTTGTCTCCC 2880

TCACAAATTA GTGAGGGAGC TTTTTTAAGT TGTAGTCTTA ACCTAGTTAG ACAGCACTTT 2940

ATCGGTAATA ACTATATTAA ACAGTAGTTA TTTGAAAGTA AGACGGACCT TATATTAAAT 3000

AAGAAGTTAT TGCTTTTAAT AAAAATGTTT TAGGCTTCGT AATTACTATA TTTATATTAT 3060

GTAAACCTAT AAAGATGATT GGTTTTCTAT CCAATAAAAA AGAAGAGAAG ATGTAACACA 3120

TCTTCTCTTC yGCAATATTA ATTAGGATTT ATTTCTAAGT TGAGTTATTT TAATTGTAAA 3180

TCTGTTTTCT TTAATTCTTT TATAACTTCT GCAGTATCAT AACAATTTGT TGCAATTGTT 3240

GAATATCTCT CTGCTAAACG ATATGCATTA ATGTAAAGCT TTAAACTTTC TTTAGCTATA 3300

TCCTCTGCAT CTTCGAATTT TGATGGGTTA GACATAACCA CTAATTCTGC AAATTTTTCT 3360

GGATCAATAT TAATAGACAT GTATTTATTT ACAACTCCTA TTTATTTTGA TGTCTTAATA 3420

CTAACATATT GAAGTTTTCA GACAAAGTAA TGTCTCTCTA TAATTGAAGA AAAATAATTC 3480
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GG ATGAACAA AACATGAGAA TAATGTTTAT AGGGGATATC GTAGGTAAAA TTGGACGAGA 3 6 00 

CGCAATTGAA ACGTACATAC CTCAACTGAA GCAAAAGTAT AAACCAACAG TTACAATTGT 3660

AAATGCTGAA AATGCAGCAC ATGGTAAAGG TTTGACTGAA AAAATATATA AACAATTACT 3720

AAGAAATGGT GTAGATTTCA TGACTATGGG TAATCACACA TATGGTCAAC GTGAAATTTA 3780

TGATTTTATA GATGAAGCAA AACGACTAGT AAGACCAGCG AATTTTCCGG ATGAAGCGCC 3840

GGGAATTGGT ATGAGATTTA TACAAATTAA TGATATTAAA CTTGCAGTTA TTAATCTGCA 3900

AGGAAGAGCG TTTATGCCAG ATATTGATGA TCCTTTTAAA AAGGCAGATC AATTAGTCAA 3960

GGAAGCACAA GAACAAACTC CGTTTATATT TGTTGATTTT CATGCAGAAA CAACTTCTGA 4020

AAAGTATGCA ATGGGATGGC ATTTAGATGG TAGAsTAGCG CTGTTGTTGG AACGCATACA 4080

CACATTCAAA CAGCAGATGA ACGTATTTTA CCAAAGGGGA CAGGGTATAT AACGGATGTT 4140

GGTATGACAG GTTTTTATGA TGGCATTTTA GGAATAAATA AAACAGAGGT AATTGAGCGT 4200

TTTATCACTA GTTTGCCACA AAGACATGTT GTTCCAAATG AAGGTAGAAG TGTATTATCT 4260

GGTGTTGTTA TTGATTTAGA CAAAGAAGGT AAAACAAAGC ACATCGAACG TATATTGATA 4320

AATGATGACC ATCCAT7TTC AACATTTTAA AATTACGTAA GTAAACATTC GAATTGGACC 4380

CTATCGTCCA TTAGTATGAA TTTAATATAG TACCACTGTT TACATAGTAA ATCGGTGGTT 4440

CTTTTTGTTA TCATTTAATA TGAAATATAT CCATAGGAGG CATATAACTA TGAAACCACA 4500

ATTATCGTGG AAAGTTGGCG GTCAACAAGG CGAAGGTATT GAATCAACTG GGGAAATCTT 4560

CGCTACGGCT ATGAATAGAA AAGGATATTA TTTATATGGA TATAGACATT TTTCAAGTCG 4620

TATCAAAGGT GGACATACGA ATAATAAAAT TAGAGTTTCT ACGACGCCTG TTCATGCAAT 4680

TAGTGATGAT TTAGATATTT TGATTGCATT TGACCAAGAA ACAATTGATG TTAACCATCA 4740

TGAAATGAGA GAAGACAGTA TTATTTTArC TGATGCCAAG GCTAAACCTG TGAAaCCAGA 4800

AGGATGTCAT GCACAGCTTA TTGAATTACC TTTTACAGCA ACCGCTAAAG AATTAGGTAC 4860

AGCATTAATG AAAAACATGG TTGCAATAGG TGCTACTAGC GCATTGATGA ATTTGAATAC 4920

AAATACATTT GAAGAACTTA TTACTAATAT GTTTTCTAAA AAAGGTGACA AGGTAGTTGA 4980

AGTCAATATC CAAGCATTAA ACGAAGGTTA TCAATTAATG CAATCTCGCT TACCTGAAAT 5040

CTACGGGGAC TTTGAATTAG AGTCAACAGA TGCACTACCA CATCTATATA TGATTGGTAA 5100

CGATGCCATT GGATTAGGTG CAATTGCTGC AGGTTCACAA TTTATGGCGG CATATCCTAT 5160

TACACCTGCG TCTGAAGTTA TGGAATATAT GATTGCCAAT ATATCTAAAG TAAACGGAGC 5220

GGTTATTCAA ACAGAAGATG AAATTGCTGC TGTAACTATG GCTATTGGTG CAAATTATGG 5280
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TGGATTATCT GGTATGACTG AAACGCCATT AG TC ATT ATT AATACCCAAC GAGGTGGACC 54 CO 

TTCTACTGGA TTACCTACGA AACAAGAACA GTCAGATTTA ATGCAAATGA TTTATGGTAC 5460

ACATGGTGAT ATTCCAAAAA TTGTTGTAGC ACCAACAGAT GCAGAAGATG CATTTTATTT 5520

AACTATGGAA GCATTTAATT TAGCAGAACA ATATCAATGC CCTGTTATAG TTCTAAGTGA 5580

TTTGCAATTA TCTTTAGGTA AACAAACTGT TGAAAAATTA GATTATAATC GTATTGAAAT 5640

TAAACGTGGT GAAATCATTC AATCTGATAT TGAACGTGAA GAAGATGATA AAGGTTATTT 5700

CAAGCGTTAT GCGTCAACAT CCGATGGTGT TTCTCCTAGA CCTATCCCCG GTGTTAAAGG 5760

AGGTATTCAT CATATAACTG GTGTGGAaCa CAATGAAGAA GOTAAACCTA GTGAATCTGC 5820

GTCAAATAGA CAACAACAAA TGGAAAAACG AATGCGTAAA ATTGAGCAGT TACTAATTGA 5880

ATCGCCAGTA GAAGCTAACT TACAACATGA GGATGCAGAT ATTCTTTATA TCGGTTTTAT 5940

TTCTACAAAA GGTGCAATTC AAGAAGGTAG TAACCGTTTG AATCAACAAG GCATAAAAGT 6000

TAACACTATA CAAATTAGAC AATTGCATCC ATTCCCAACA AGCGTTATTC AAGATGCAGT 6060

TAATAAAGCG AAGAAAGTCG TTGTAGTGGA GCACAATTAT CAAGGACAAT TGGCTAGTAT 6120

TATAAAAATG AATGTCAATA TTCATGATAA GATTGAAAAT TATACAAAGT ATGATGGGAC 6180

ACCTTTCCTA CCACATGAAA TCGAAGAAAA AGGCAAAATA ATTGCTACTG AAATAAAGGA 6240

GATGGTATAG ATGGCGACAT TTAAAGATTT TAGAAATAAT GTTAAGCCTA ACTGGTGCCC 6300

CGGATGTGGC GATTTCTCAG TACAAGCTGC AATTCAAAAA GCAGCCGCAA ATATAGGGTT 6360

AGAACCTGAA GAAGTAGCTA TCATCACCGG TATAGGATGT TCTGGCCGTC TTTCAGGATA 6420

TATTAATTCT TATGGCGTTC ATTCTATTCA CGGACGTGCA TTACCTTTAG CTCAAGGTGT 6480

AAAAATGGCG AATAAAGATT TAACTGTTAT TGCATCGGGA GGAGATGGTG ATGGTTATGC 6540

TATAGGTATG GGGCATACAA TCCATGCTTT AAGAAGAAAT ATGAACATGA CGTATATAGT 6600

CATGGATAAT CAAATTTATG GTTTGACAAA GGGACAAACA TCGCCGTCAT CAGCAGTAGG 6660

ATTTGTTACT AAAACAACGC CAAAAGGTAA TATAGAAAAA AATGTTGCGC CTTTAGAATT 6720

AGTATTATCA TCTGGTGCCA CATTTGTAGC CCAAGGTTTT TCAAGCGATA TTAAAGGATT 6780

AACAAAACTA ATTGAAGATG cAATTAATCA TGATGGATTT TCATTCGTTA ATGTCTTTTC 6840

ACCATGTGTG ACTTATAATA AAATTAACAC ATACGATTGG TTTaAAGAAC ATTTAACAAG 6900

TGTTGATGAc ATTGAAAATT ATGATTCTAC AGATAAACAA TTAGCGACTA AAACTGTTAT 6960

TGAACATGAA TCTTTAGTAA CTGGTATTGT TTATCaAGAT AAAGAAACAC CATCATATGA 7020

ATCtCAAATT AAAGAGTTAG ATGATmCACC ACTTGCTAAA AGAGATATCa AAATTaCTGA 7080
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TGTATTTATA ACAGATCCAT TTATGCTACT CAGTTTTTTA CTATTACAAA AAATAAAGGA 7200 

GTTTTTAAAA ATGAAAGACA CATTAATGAG TATACAAATA ATTCCTAAAA CACCAAACAA 7260

TGACAATGTT ATACCTTACG TAGACGAGGC GATTAAAATA ATTGACGAAT CTGGTTTGCA 7320

TTTTAGAGTA GGTCCGTTAG AAACGACAGT ACAAGGAAAT ATGAATGAAT GTTTAATTTT 7380

AATACAATCA TTAAATGAAC GAATGGTGGA ACTTGAATGT CCAAGTATTA TTAGCCAAGT 7440

TAAGTTTTAT CATGTGCCAG ATGGCATCAC TATTGAAACT TTAACTGAAA AATATGATGA 7500

ATAACATTAA AAGTGAAGTA AACTGGATTT GAATTGGCTT GTTAGAGATG ACGTATAACT 7560

TTAACTGTTT TTGCACTTTA TAGTTAAATT TAATATAATT ATTAAATGAT ACGGGCAAAT 7620

AGAAAGGATT TTGTAAAGTG AACGAAGAAC AAAGAAAAGC AAGTTCTGTA GATGTTTTAG 7680

CTGAGAGAGA TAAGAAAGCA GAAAAAGATT ATAGTAAATA TTTTGAACAT GTTTATCAGC 7740

CGCCTAATTT AAAAGCAAGC GCAAAAAAAG AGGTnAAA 7778

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1128 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

AGATGAAGTT GTTACgAAAA TTGCGTACGC TGTTTCAGAA CATGTCAAAA TAGAAACAGG 60

TAATCCATTC TTTCAAACAT CACATAGTGG TTGTGCGACG GGCGGATCCT GTAATTGTTC 120

ATTATAAAAA ACATCGAGTC AGAAAAAGGT GGTTATTGAA cCACTAACTA GCATCTGACT 180

CGATGTTTTT ATTTATTCGG GATTGTTTGT TTGAATTGTT GTGCTAAATC TGGTCGATCT 240

GTCACAATCG TGTGTGCACC TTTTTGGTAT AAATCATTCA TCAGATTTAT ACTATTTACG 300

CCATAATAGC CTGGAATGAT ATTCATATCA TTTAACCATT TGATAAAACG AGATGAAGTC 360

AAATCAATGC CTTTAAAATG AGTAGGCATT TGGAACGTTT GTGCTAATGG TTGGTAGTAC 420

CTACCACCTA ATAAATGATA TTTTAAAAAT GCTTCTGTAA CTTCCTGTTG GCTAGCACCA 480

ATTGCGACGG ATCCTTGTGC AATTTTATTA AAACGAACGA TTTGTTCTTT ATAAAAACTT 540

GTCACAAGAA CGCGGTCAAA TGCTTGATTT TCTGCAATTG TATCAAACAT AATTTGTGGT 600

GCGATTGAGC CTTCATAGGA TTCAGGAGCA TCTTTTAAGT CTACGTTTAT ATACATATCA 660

GGATATTGCT TCAGCAACTc ATCGAAGGTT AGTATAGCTG TGTGTGCATG ACCACGATAT 720
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35 



AATGTATGGG CACTAACTTT TCCAGAGCCG TTCGTC3TTC TATCAACAGT TGCGTCATGA 840

AAAACGATAA GCTGTTGATC TTTTGTGAGT CTCACATCTG TTTCAAAGCC ATCAACGCCT 900

AATTGTTTAG CATAGTCAAA TGCAAGTTGC GTTTGCTCTG GTCTTAAAGC CATACCACCG 960

CGATGCGCAA ATATATATGG TGCATTGCCT TTGAAAAAAG CAGGGATGGT TTGCTTTTTA 1020

GTAATCACTT TATTTTTATT GATCATTAAT AGACTACTTA AAAATCCAGC ACCGACTAGT 1080

ACCGCATTTA AAATGTTTCT GTTTACnTTT TTCATAAAAA ATTCCTCC 1128
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6252 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

CAAGCAAACA ATCGTCGATA AAATTGCTAA AATAATAAAA GTAATTCGAA CTTTCATCAT 60

GATCATCCTT TGTTTATAGA GTCAATATAA GTATGGAATA TGTTAGGTAT ATAGTCAAAT 120

GCGTCAACTA ATGGGAATTT TGGCATAGAT AGAGAATTTA AGGCAATTAA AAAGGCATCA 180

AACAGTAATA TGCTGCTTGA TGCCCAAATG ATGACTTTAG CTAAATTGAT TAGTCACTTT 240

TAAAGATAAA GAATTGTCAT GAATTAAAAC TCATGTAATG ATGTGTTACA TTTCGCAATG 300

ATGGCTTTCA GTTATTTATC GATAACATCA CTCTTGATAC CTTTAGATTT TAAGAAATCT 360

TTAATTTTAT CTTGTTGCTT TTTATTAACA TCACCGGCAT ATTTTGTTGG CACGTCGACA 420

ACATTGATTT TATTTTGCGG TTGATAGCTA AGCTTTTCAA TATCTTCATC AACATTGGCG 480

ATTGTACTAT TTAAAGCTTT GAAGTAATTC ATCATTAATT CAACGGGTTT CTTATATTCT 540

TTAGGAATAT TGTTTTCAGT GACAAATTTC TTGAAATGCA AATCGTTTTT AACAGCTAAG 600

TTAGATAAGT GGCTAAGTGT TTCTGCTTGT TTTTCAGTCA CTTTTGTTTG ACTGTCAATT 660

TGTTTATCTA GTTTATGTTG CATAATATAT TTGTTATCAA GTATATCGCT ATTTACAGAC 720

AAATACTTTT CTATAGCTTG CTTCATCTCT GCATCACTAA TATCACTATT TTTCTTATCT 780

GAGTTAAAGA TATCTTTTGT tTCTAATTTT TTAGCGCTTT TAGGTGCATG GATGCCAGTA 840

CTTGTATGAT GATCTTCGTT ATCAGATTGA TCGGACGCGC AACCTGTAAG AATTAATGTC 900

GATGCTAAAA ATGTACTTAG TAGTAATCTC TTTTTCATAA TGTAATATAA CTCCTTAGTT 960

TATCTTTAAT TGAAAAAATA TGTATTCATG TTTAATAGAG TAACATTGAA TTAGTTTGGA 1020
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TCTATCAATA A7GCATCATT TTGGACGTTG TTAAGGA7AG CTTTATCTAT AAATAACTGC 114 0 

ATAATTGGTT GTACTAATTT AGACGTAGGT ATCGTACGTA AAAGCATAA7 AATTTCGTTC 12 0 0 

5 

ACATACTTTT CTTTCTCAAT ATCAT7TTTC ATATTGATTT GTTTGCGAGA GGTACATACT 126 0 

TTAAGCATTA TCGCACATCT CGTTGTATAT AT7AAGTTTA TCATAACATG ATTTTATGTC 13 2 0 

GGGATAAAAA AATAACAGCA TCTTAACAAA TGTAAGATAC TGTCAGTGAA ATGAATGAAA 13 8 0 

10 

CTTTAGTTTC TGaTAATATA GTCAAAGGCA TTTAATGCTG CATTTGCACC AGCGCCCATT 144 0 

GAAATGATAA TTTGTTTGTT CTTCTGATCT GTGACATCGC CAGCAGCAAA TATTCCAGGA 15 0 0 

is ACATTCGTAT TATTGTTACG ATCAATCACA ATTTCACCAC GTTCGTTTAA TTCAACAGCA 1560 

TCGTTTAACC ATGATGTGTT TGGAAGTAAA CCAATTTGAA CAAAGATACC ATCTAAGTTA 162 0 

AGTAGATGTT CTTCGCCGGT GTTCATGTCT TCGTAACGTA TACCTGTAAC ATGGTCTTCT 16 80 

20 

CCGACAACTT CAGTAGTTTT GGCATTTGTT TTGATATCAA CATTTGATAA AGAACGTAAA 174 0 

CGATCTTGTA ACACGTTGTC TGCTTTTAAT TCGCTAGCGA ATTCGAATAA TGTAACATGA 1800 

TTAACGATAC CAGCAAGGTC AATTGCTGCT TCAACCCCAG AGTTACCGCC ACCGATAACT 1860 

25 

GCTACGTCTT TATTTTCAAA TAGAGGTCCG TCACAGTGAG GGCAGAATGC AACACCTTTA 192 0 

TTAATCAATT GCTCTTCACC TGGAATGTTT AGCTTACGCC AACCTGCACC AGTAGCAATA 198 0 

30 ATGACTGTTT TACTTTCTAA GACAGCACCG TTTTCTAACG TAACTTTAAT TGCTTCGTCA 2 04 0 

GTCTTTTCGA TATCTGTAGC ACGTATACCT GTCATTG CAT CAATGTCATA TTGATCAATG 210 0 

TGCGCTGCTA AGTTAGAAGA AAATTCAGAA CCAGTTGTTT CTTTAACAGT AATGAAGTTC 216 0 

35 TCAATACCAG CAGTATCATT AACTTGGCCA CCGATACGAT CAGCAACTAT ACCAGTACGT 222 0 

AAACCTTTAC GTGCTGTGTA AATCGCTGCA CTACCACTAG CAGGACCACC ACCAACGATT 2280 

AAGACATCAT AAGGTTCTTT ATTTTCAAAC TCAGATGCAT CTGCCGTACT GCCTAGTTTC 2 34 0 

40 

GAAAGAATAT CTTGGATTGT CATACGACCA TTGCCAAATT CTTCGCCATT TAAAAAGACA 24 0 0 

GCAGGGACTG CCATGATGTT TTCAGATTCT TCACGGAACA CTGCACCATC AATCATAGAA 24 60 

45 TGCGTGATGT TAGGGTTGAT CACACTCATT AAGTTAAGTG CTTGAACGAC ATCAGGACAT 2 52 0 

TTTTGACACG TTAAACTAAT GAATGTTTCA AAATGGAATG AACCTT CTAA TTTTTTAATT 2580 

TGGTCAATGA TTGACTGTTT TTCTTTAGGT GCACGACCAC TAACCTGTAA AATTGCTAAA 2 64 0 

50 ACAAGTGAGT TAAACTCGTG ACCTAATGGA ATACCTGCAA ATGTTACACC TGTTTCTTCG 27 00 

CCAGGACGAT TGACTGAGAA ACTTGGTGTA CGTTTTAAAG ATTTTTCAGA AAGAGATAGT 2760 

CTAGGTGACA TATCAGTAAT TTCTGTCAAC AAATCTTTAA GTTCTTTGGA TTTATCATCT 2820
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20 






TGTTGTTTTA AATCAGCATT AAGCATGGTT GTAATGCCTC CTTAGATTTT ACCTACTAAA 2940

TCTAAACCAG GTTGCAATGT TTTAGCGCCT TCTTCCCATT TAGCTGGGCA TACTTCGCCA 3000

GGGTTTTTAC GAACATATTG AGCTGCTTTG ATTTTGTGAG CTAATGTACT AGCGTCACGG 3060

CCAATTCCGT CAGCGTTAAT TTCAGATGCT TGTACAACAC CGTCTGGGTC GATAATGAAT 3120

GTACCACGTT GAGCTAAACC AGTAGCTTCA TCTAATACAT CAAAATTACG AGTGATTGTT 3180

TGTGATGGGT CACCAATCAT AGTGTAAGTG ATTTTGCTAA TTGCATCTGA ATGGTCATGC 3240

CATGCTTTGT GTACGAAGTG AGTATCAGTT GATACTGAGA ATACATTTAC GCCTAATTTT 3300

TGTAATTCTT CATATTGGTT TTGTAAGTCT TCTAATTCAG TTGGACAAAC GAATGAGAAG 3360

TCAGCAGGAT AGAAGCATAC TACGCTCCAA GAACCTTTTA AATCTTCTTG TGTAACTTCT 3420

TTAAATTGAT ctttttttgg ATCGAAATCT TGCGCTGTAA ATGGTAAGAT TTCTTTGTTA 3480

ATTAATGACA TAAATATCTT CCTCCTAAGA ATTTAAGTAT GAATTAGAAC TATCAATTGA 3540

TTGCGCTTAA TTATAATAAT TCTAATCTCT TAGTTAGCAT TATTACATTT TGATCCAGAA 3600

TAGTCAACTG GATAACTTTG TAAAGTGAAT GATTACTTTT AAAATAAAGA AAGATAATAT 3660

AAAGTGCTTT GATAATGGAT TTTGTAGTTG ATGATTTAAA AGGTTGTGTC TATATTTAAT 3720

ATCTTGATTT TAATGTAAAA AATGTAAAAA AAGAAGATTT GTATTCTCAA CTAAGTCAAC 3780

CTTATTGATA ATGGTATGAG AATATTTGTT CGAGATGGAT GAAGGTAATG AGTGAGAAAC 3840

TGGATTTTTA AAGTATGAGA CAATATTTTA AAAAGTTCAA TTATTAACTT ATAAGCAAAT 3900

AATTGCTATA AAAAAGTTTG GACGTGTACA ATTGCAATAT GAAGATTTTA AATTAATTGT 3960

AAAGTATCGA GGAGTGGGTA ACGTGTCAGA ACATGTATAT AATCTTGTGA AAAAGCATCA 4020

TTCTGTTAGA AAATTTAAGA ATAAACCTTT AAGTGAAGAC GTTGTTAAGA AATTGGTAGA 4080

AGCTSGACAA AGCGCTTCGA CGTCAAGTTT CCTGCAAGCA TACTCAATTA TTGGTATCGA 4140

CGATGAGAAG ATTAAAGAAA ATTTACGAGA AGTTTCTGGA CAACCTTATG TTGTAGAAAA 4200

TGGCTATTTA TTCGTCTTTG TTATTGATTA TTATCGTCAT CATTTAGTTG ATCAACATGC 4260

TGAAACTGAT ATGGAAAATG CATATGGTTC AACGGAAGGT TTGCTAGTAG GTGCAATCGA 4320

TGCAGCATTA [illegible] ATGGCATTGT 4380

CTTTTTAGGA TCATTAAGAA ATGATGTTGA ACGCGTTCGA GAAATTTTAG ACTTACCTGA 4440

CTATGTCTTC CCGGTATTTG GTATGGCAGT AGGGGAACCC GCAGATGACG AAAATGGTGC 4500

AGCCAAGCCA CGCTTACCAT TTGACCATGT CTTCCATCAT AATAAGTATC ATGCTGATAA 4560

GGAAACACAG TATGCACAAA TGGCAGATTA CGACCAGACA ATCAGCGAGT ACTATGATCA 4620
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CAAAGCAAGA TTAGATATGT TAGAACAATT GCAAAAATCA GGCTTAATAC AGCGATAgCA 4 74 0 

AGATACCAAA ATAACCCGCC CCCCTCTAGC TTAAAATGAT AAGTATAGCT AGAGGGGGCG 4 800 

GGTATTTCTT GCAATGAATT AGTGTGAAGT TAATGCAGCA TTATCATTTG AATCGAAAGT 4860

ATCTTTATCC CAATGTTTAG TTAACTTGGC GGTACCTGTA CCAGCTAGCA TTGAATCGTT 4920

CACGTTTAAT GCTGTTCTAC CCATGTCAAT CAATGGTTCA ACGGAGATGA GCACGCCGGc 4980

TAAAGCGACT GGCAAGTTTA ACGTTGACAA CACCAATATG GATGCAAATG TAGCCCCGCC 504 0 

ACCGACGCCA GCAACGCCGA ATGAACTAAT AATCACGACA GCGATTAACG TTACAATAAA 5100 

TTGTAAATCA ATTTCTACAT TAGCGACGGG TGCGACCATA ATTGCAAGCA TGGCAGGGTA 5160

AATGCCTGCA CAACCATTTT GTCCAATCGA CAATCCAAAT GTCGCAGCGA AATTGGCAAT 5 22 0 

ACCTTCTGGC ACGCCTAGAC GTCTTGTTTG TGTTTGTACA TTCAATGGTA AGGCACCCGC 52 8 0 

GCTTGAGCGT GATGTGAATG CAAAGATTAA TACTTCCAAA GTCTTTTTAA CATAGCGAAT 5340

TGGGCTAATA CCTAACAGGC TTAAAATAAT TAAGTGAATG ATATACATCG TAATTAATGC 54 00 

AGCGTACGAT GCGATTAAGA ATTTTCCTAA AGTCCAAATG GCGCCAAAGT CACTTGTCGA 5460

TAATGTGTTG GCCATAATTG CTAATACACC GTATGGCGTT AAACGTAAGA CGAACGTCAC 5520

AATCGCCATT ACTAGTGAAT AGATAGCGTC AATCGCACGC TTAAGCAATT CACCATGATC 5580

AGGTTGTTTG CGTnTACGCG TAAATAAGCA AATCCTATAA ACGAAGCAAA TATCACGACA 564 0 

GCAATCGTGG aAGTTGCACG TTGTCCaGTG AAATCTAAGA ATGGATTTTT AGGCAATAAT 57 00 

TCCAAAATTT GTTGTGGTAA CGTATGTGCT GTTAAATCTT TCGCTTGTTT AGCAATTTCG 5760

CTTCCACGTG CTTGTTCAGC GTTACCAAGG TTAATTGTTG ATGCATCTAA ACCAAACACC 5820 

AAGGCATACA CAACACCAAC AATCGCAGCA ATGGTGACAG TGCCAATTAA AAAGATAAAA 5880

ATGASACTAC CAATTTTAGC AAACTTTTCT CCGATTTGAA TTTTAGTGAA TGCAGCTACA 5940

ATAGAAATGA AAATTAAAGG CATAACAATC ATTTGCAACA ATGCAACGTA ACCTTGTCCG 6000

ACAATGTTGA ACCAGTCACT TGTTGATGTA ATAACATTCG AATGTGTGCC ATAAATAAGA 6 060 

TGCAATAACA CACCGAATAC TATACCAATC CCTAAAGCTG TAAACACACG TTTCGCAAAA 6120 

GATATATGTT TGCGAGCCAT CATGTGCAAT ATTACGATGA AAATCACCAA TACAATAATA 6180 

TTAATCAGTG TAAGAAAAGC ATTCATGAAC GTCACTCCTT AAATTTTTGA ATATAATTCC 6 24 0 

GACTAGTATG CT 6252
(2) INFORMATION FOR SEQ ID NO: 51:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6730 base pairs
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is 



(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

ATCAAATCnC AAAATATTTA TTAATnAnAA GGGGATTATC CaTGTgAGAA ACAAAGTAAT 6 0 

GCTCTTTTTT TACCTCTTGT GGGTTGAAAA aTGGATCATC AGAGATAGAC TTCTTCTTTT 120 

TCGAAGATGA CATTTGATAC TTTAATCTTC TAAAACCATA ACTTGTCGCA TCAAAAATGC 180 

CTTCTTGTAC AAGTAAAATC AAAAATATGC TAATAAAAAT AATTAATGAA ACATAAAACA 240 

ATATATTTAA ATATGTAATG ATAGTATGGC TATTAAAAAG CCATATAATA AACGTTAATA 3 00 

TTGGCGTTAT TAGTGCCATT CCAAGCCATT TTTTCAACAT TTGATCACTC CCACTTATAG 3 60 

AAAACTCTTA CGCATAGTTT ACATTAAAAT CAGACATTGA GGAATGATTT TTTAATTTCT 4 20 

TCAGCTTTAT TGAAATTCTA AAATCAATCA TTCTTCATTA GTTTAAAGCA AAAAAATATT 4 80 

GATATATAGT AAATATTGTA TATATAATAT TAGTTAAGAT TTCaGAAAAT TTTGAAGGGA 54 0 

ATGGAAATTT AGAAATCGGA ATTTGTTAGA GGAGGGGATT AGATGGGGAA ATATATTTTC 6 00 

AAACGATTTA TTTATATGCT TATTTCTTTA TTTATTATTA TTACAATTAC ATTTTTCTTA 6 60 

ATGAAATTAA TGCCAGGTTC GCCATTTAAC GATGCTAAAT TAAATGCTGA ACAAAAAGAA 720

ATTTTAAATG AAAAATATGG ATTAAATGAT CCTGCAGCTA CGCAgTATTT ACATTATTTA 780

AAAAATGTTG TTACAGGCGA TTTTGGTAAT TCATTCCAGT ATCATAATCA ACCTGTGTGG 840

GATTTGATTA AACCGAGACT ACTACCTTCT TTTGAAATGG GTCTTACAGC AATGTTCaTC 900 

GGTGTGATAC TGGGACTTAT TTTAGGTGTT GCAGCAGCTA CTAAACAAAA TTCTTGGGTT 960 

GACTATACAA CTACAGTTAT TTCAGTTATT GCAGTATCTG TACCATCTTT TGTACTTGCT 1020

GTACtTTTAC AATATGTATT TGCAGTTAAA TTAAGATGGT TCCCAGTAGC TGGATGGGAA 1080

GGTTTTTCGA CCGCGGTATT ACCGTCACTT GCATTATCTG CAGCTGTTTT AGCAACTGTC 1140

GCCAGATACA TAAGAGCAGA GATGATAGAG GTATTAAGTT CAGACTATAT TTTATTAGCG 12 00 

AGAGCTAAAG GTAATTCGAC AATGCGTGTA CTTTTTGGAC ATGCACTTAG AAATGCTTTA 1260

ATTCCAATTA TTACAATTAT CGTTCCCATG TTAGCAAGTA TTTTAACAGG CACTTTAACA 13 20 

ATTGAAAATA TTTTTGGAGT TCCTGGATTA GGGGATCAAT TCGTACGTTC AATTACAACA 13 80 

AATGATTTCT CAGTAATCAT GGCAATCACA CTATTATTTA GCACACTGTT TATCGTTTCT 1440 

ATTTTTATTG TAGATATTTT GTACGGTGTG ATAGATCCAC GAATTCGTGT TCcAAGgAGG 1500 

TAAAAAATAA TGGCTGAAAA TAAAAACAAT TTGTCGATTA ACGACGATCA TTCTAATGCA 1560

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TGAATCAGGA ACCTGAAATG CAACGAGAAA GCAAAAACTT TTGGCAAGAT GCTTGGGCTC 15 30 

AGTTAAAACG AAATAAGTTA GCTGTTGTCG GTATGATAGG TTTAATTATC ATTGTAATAT 174 0 

TTGCTTTTAT CGGTCCAGTT ATAAATAAAC ATGATTATGC TGAACAAAAT GTAGAACATA 1800

GAAATCTTCC GGCAAAAATA CCTGTATTAG ACAAAGTTCC ATTTTTACCT TTTGATGGTA 1860

AAGATGCAGA TGGCAAGGAT GCTTATAAAG CAGCAAATGC TAAAGAAAAT TATTGGTTTG 1920

GTACTGATCA GTTGGGTCGA GATTTATGGA CAAGAACATG GAAAGGTGCT CAAATTTCAT 1980

TGTTTATCGG TGTTGTTGCA GCGATGTTAG ATATTTTTAT TGGTGTTGTA TATGGTGCGA 2 04 0 

TTTCTGGATT CTTCGGTGGA CGTGTCGATA CGATTATGCA ACGTATACTT GAAGTCATAG 2100

CATCTATTCC GAATTTAATT GTCGTAATTT TATTTGTATT AATTTTTGAA CCATCCATTT 2160 

GGACAATTAT ATTGGCTATG TCTATCACAG GCTGGTTAGG CATGAGCAGA GTTGTACGTG 2 22 0 

GAGAATTTTT AAAATTAAAA AATCAAGAGT TTGTCATGGC TTCGAAAACA TTGGGGGCTT 2280

CAAAATTCAA ATTGATATTT AAGCATATTT TACCTAATAC ATTAGGTGCT ATCGTGGTTA 2340

CATCAATGTT TACAGTACCT AGTGCTATTT TCTTCGAAGC ATTTTTAAGT TTCATTGGTA 24 0 0 

TAGGTGTACC CGCACCTCAA ACATCGTTAG GGTCATTAGT AAATGATGGG CGCGCAATGT 24 6 0 

TATTAATTTA TCCACATGAA TTATTTATAC CAGCAATGAT TTTAAGTTTA TTAATTCTAT 2 52 0 

TCTTTTACTT ATTTAGTGAT GGATTACGTG ATGCATTTGA TCCGAAAATG CGTAAATAAA 2580

AAGGGGGCAT AGCATATGAC TGAAAGAATA TTAGAAGTAA ATGATTTGCA TGTTTCCTTT 2 64 0 

GATATTACAG CAGGGGAAGT GCAGGCAGTG AGAGGCGTAG ATTTTTATTT GAACAAAGGG 2 700 

GAAACATTGG CAATTGTTGG TGAATCAGGT TCAGGTAAAT CTGTAACAAC AAAAGCAATT 2760

ACAAAATTAT TCCAAGGGGA CACAGGAAGA ATTAAAAAGG GAGAAATTTT ATTTTTAGOG 2820

GAAGATTTAG CAAAAAAACC TGAAAATGAG TTGATTAAAT TACGTGGCAA AGATATTTCA 2880

ATGATCTTTC AAGATCCAAT GACATCTTTA AACCCAACGA TGCAAATTGG TAAACAAGTC 2 94 0 

ATGGAACCAT TAATTAAGCA CAAAAATTAT AGTAAAGCAC AAGCTAAAAA GCGCGCATTG 3 000 

GAAATACTAA ATCTTGTAGG TTTACCAAAT GCAGAAAAAA GATTTAAAGC ATATCCTCAT 3 060 

CAATTTTCAG GTGGACAAAG GCAAAGAATT GTTATTGCAA CCGCATTAGC TTGTGAACCT 3120 

AAAGTGCTCA TTGCTGATGA ACCAACGACT GCATTAGACG TAACGATGCA GGCACAAATT 3180 

TTAGATTTAA TGAAAGAACT ACAACAAAAA ATCGATACAG CAATTATTTT TATAACGCAT 3240

GATTTAGGGG TTGTTGCGAA TATTGCTGAT AGAGTGGCAG TTATGTATGG TGGTCAAATG 3 300 

GTTGAAACAG GAGATGTTAA CGAAATATTT TATGATCCAA AGCATCCATA TACATGGGGA 3 3 60 
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30 



GGAGCGCCAC CTGATTTATT ACACCCACCT AAAGGTGATG CATTTGCGAG ACGTAGcAAT 3480

ATGCATTAGA TATTGATTTT AAAGTAGAAC CACCGTGGTT TAAAGTTTCA CCGACACATT 3 54 0 

TTGTGAAATC TTGGTTATTA GACGCACGTG CACCAAAAGT TGAACTACCC GAGCTGGTAA 3600

AACAACGTAT GAAACCGATG CCTAATAATT ATGAAAAACC ACTCAAGGTA GAAAGGGTGT 3660 

CGTTCAATGA AAAATGATGA AGTGCTATTA TCTATTAAAA ATTTAAAGCA ATATTTTAAC 3720 

GCAGGAAAGA AAAACGAAGT GgaGCGATTG AAAATATTTC GTTTGATATA TACAAAGGGG 37 80 

AAACATTAGG TTTAGTAGGA GAATCGGGGT GTGGTAAATC TACAACTGGT AAATCAATTA 3 84 0 

TTAAACTTAA TGATATTACA AGTGGAGAAA TTTTGTATGA GGGTATTGAT ATACAAAAGA 3 900 

TTCGTAAACG TAAAGATTTG CTTAAATTTA ATAAAAAGAT ACAGATGATT TTTCAAGACC 3960

CATATGCGTC TTTAAATCCT AGGTTAAAAG TAATGGATAT AGTAGCTGAA GGTATTGATA 4 020 

TCCATCATTT AGCAACTGaT AAGCGTGACC GAAAAAAACG TGTCTATGaT TTACTTGaAA 4 080 

CTGTTGGATT AAGTAAAGAA CATGCCAATC GCTATCCTCA TGAATTTTCA GGTGGaCAAC 414 0 

GCCAACGTAT TGGaATTGCC CGTGcATTAG CCGTTGaACC AGAATTCATT ATCGCGGACG 4 200 

AACCAATATC GGCATTGGAT GTTTCAATCC AAGCTCAAGT AGTTAATTTA TTATTAAAAT 4260

TACAACGTGA AAGAGGGATT ACGTTCCTAT TTATAGCTCA TGATCTATCA ATGGTGAAGT 4320

ATATTTCAGA TCGTATTGCA GTCATGCATT TTGGGAAAAT AGTTGAAATT GGACCGGCAG 4 3 80 

AAGAAATTTA TCAAAATCCA TTACACGATT ATACTAAGTC TTTATTATCA GCCATTCCAC 4 440 

AACCTGATCC TGAATCAGAA CGCAGTCGCA AACGATTTAG TTATATTGAT GATGAAGCAA 4 500 

ATAATCATTT AAGACAATTA CATGAAATTA GACCGAATCA CTTTGTCTTT AGTACTGAAG 4 5 60 

AAGAAGCGGC ACAACTACGA GAAAATAAAT TGGTGACACA AAATTAAGGG GAAGGGGGAA 4620

ATG^AATGAC GAGAAAATTT AGAACACTTA TTTTAATTTT GATTGCTACA ATTGCATTAA 46 80 

GTGGTTGTGC TAATGACGAT GGTATTTATT CAGATAAAGG TCAAGTATTC AGAAAAATTT 4740 

TGTCATCAGA CTTAACATCC CTTGATACAT CATTAATAAC GGATGAAATA TCTTCTGAAG 4800

TGAcTGCGCA AACATTCGAA GGTTTATACA CATTAGGAAA AGGTGACAAA CCGGTGTTAG 4 860 

GTGTTGCGAA AGCTTTTCCT GAAAAGAGTA AAGATGGTAA AACTTTAAAG GTTAAATTAA 4920 

GAAGCGATGC TAAATGGAGC AATGGTGACA AAGTGACTGC ACAAGACTTT GTTTATGCTT 4 980 

GGAGAAAAAC AGTTGACCCT AAAACAGGTT CTGAATTTGC ATACATTATG GGGGACATTA 5040 

AAAATGCGAG TGATATTAGT ACTGGTAAGA AACCTGTAGA GCAATTAGGT ATCAAAGCAT 5100 

TAAATGATGA AACATTACAA ATTGAATTAG AAAAGCCGGT TCCATATATT AATCAATTAT 5160
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ACGGTACGGC AGCTGATAGA GCGGTATACA ATGGTCCaTT TAAAGTTGAT GATTGGAAAC 52 8 0 

AAGAAGATAA AACCTTACTA TCTAAAAATC AGTATTATTG GGATAAAAAG AATGTAAAAT 5 34 0 

TAGATAAAGT GAATTATAAA GTTA r ITAAAG ACTTACAAGC CGGTGCATCA TTGTATGATA 54 00 

CTGAATCAGT AGATGACGCA TTTATTACTG CAGATCAAGT AAATAAATAT AAAGACAACA 54 60 

AAGGATTAAA CTTTGTGTTA ACGACTGGGA CATTTTTTGT AAAAATGAAT GAAAAACAAT 5520

ATCCTGATTT TAAAAACAAA AATTTAAGAT TGsTATCGCA CAAGCAATAG ATAAAAAAGG 5 58 0 

ATACGTTGAT TCAGTGAAAA ACAATGGCTC AATTCCTTCC GATACACTAA CAGCCAAAGG 564 0 

AATTGCGAAA GCGCCTAATG GCAAAGATTA TGCGAGTACC ATGAATTCGC CTTTAAAATA 5700

TAATCCTAAA GAAGCAAGAG CACACTGGGA CAAAGCTAAA AAAGAGTTAG GTAAAAATGA 57 60 

AGTGACATTT TCAATGAACA CAGAAGATAC ACCAGATGCA AAAATATCTG CTGAATATAT 5820

CAAATCGCAA GTTGAGAAAA ATTTACCAGG AGTTACTTTG AAAATTAAGC AATTACCGTT 58 80 

TAAACAAAGA GTATCACTAG AACTGAGTAA CAATTTTGAA GCATCACTTA GTGGTTGGTC 594 0 

TGCAGATTAC CCTGATCCTA TGGCTTATTT AGAAACAATG ACCACAGGTA GCGCACAAAA 6000 

TAATACAGAC TGGGGTAATA AAGAATATGA TCAATTACTT AAAGTAGCAA GAACCAAATT 6060 

GGCACTTCAA CCGAACGAAC GATATGAAAA CTTGAAAAAA GCAGAAGAAA TGTTCCTAGG 6120 

AGATGCACCG GTAGCACCAA TTTATCAAAA AGGTGTtGCA CATTTaACAA aTCCTCAAGT 6180

AAAAGGATTA ATTtACCATA AATTTGGTCC AAATAACTCA CTTAAACATG TATATATTGA 624 0 

TAAATCGATA GATAAAGAAA CAGGTAAGAA GAAAAAATAA TATGCTTTGT AAATTAGGCT 63 00 

GGAGACATAT CTCCAGTCTT TTTGTGTTGG ATAAAAaCTT TGGGAATAAA AATTTAAAAT 6360

AAGTCGTTTT TTAAATTACT GAAATTGATT AAATGCATAA ATAACTGAAT ATTCTAAAAA 642 0 

TAAACTTGTA ATAATTTTTT CTATGAGTAA ACTAAAAAGA AAAAATTAGA TTGAAAGTAG 64 80 

GAGGCATATG TATGGGGAAG CTAATTAAAT ATATTTCAAT ACTTCTTATT GTCGTTTTAG 6 54 0 

TGTTGAGTGC TTGCGGAAAA AGCAGTAATA AAGATGAAGG AGTAAAAGAT GCTACTAAAA 6 600 

CGGAAACCTC AAAACATAAA GGTGGTACCT TAAATGTAGC ATTAACAGCA CCGCCAAGTG 6660

GTGTTTATTC TTCGTTATTA AATAGTACAC ATGCAGATTC TGTAGTTGAG GGATATTTTA 6 72 0 

ACGAAAGCTT 6730

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6482 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

AATTTTTGTC ATTATTAAAA ACCTCGCTTT TAAAAGATTG AAAAGTAAAT GAGTGAAATT 6 0 

AAAGATTATG CACATTAAAA TCACGCCACA ATTTAATTGT GAAAAATATC ACAAATATAT 120

TATAACACTA AATTTCCCAA AATTCAAAAG TGTGTTTTAT TGCAGAAAAC TTATAACAyG 180

TGCACAAGTT ATAGTGAATT GCAAACGGAT TACTTTAGTC TTTTTAAAAC ATGAAGTATA 24 0 

ATTTGTATAG CAATAAATAT AAAAATGGGA GGCTATGTTC AATGAGCAAT ATGAATCAAA 3 00 

CAATTATGGA TGCATTTCAT TTCAGACATG CGACTAAGCA ATTCGATCCA CAAAAGAAAG 360

TTTCGAAAGA AGATTTTGAA ACAATATTAG AGTCAGGTAG ATTGTCTCCA AGTTCTCTTG 4 20 

GGTTAGAACC TTGGAAGTTT GTCGTGATTC AAGATCAAGC GTTACGTGAT GAATTAAAAG 4 80 

CGCACAGTTG GGGCGCAGCA AAACAATTAG ATACAGCGAG CCATTTTGTG CTAATTTTTG 54 0 

CGCGTAAAAA TGTAACGTCA AGATCACCGT ATGTACAACA TATGTTAAGA GATATTAAAA 6 00 

AATATGAGGC ACAAACGATT CCAGCTGTTG AACAAAAATT CGATGCATTC CAAGCAGATT 6 60 

TCCATATTTC TGATAATGAT CAAGCCTTGT ATGACTGGTC AAGTAAACAA ACGTATATTG 720 

CATTAGGCAA TATGATGACG ACAGCCGCAT TGTTAGGTAT TGATTCATGT CCGATGGAAG 780

GTTTTAGTCT GGATACAGTG ACAGACATTT TAGCAAATAA AGGGATCTTA GATACTGAGC 840

AATTTGGTTT ATCAGTGATG GTCGCATTTG GCTACAGACA ACAAGAGCCA CCGAAAAATA 900 

AAACACGCCA AGCTTATGAA GATGTTATTG AATGGGTTGG ACCAAAAGAA TAAATAGAAT 96 0 

ACCGTATGTC TAAATATATA AAATTAAAAA GTTAGCAATA AAAAAGCCTG CGATTACATA 102 0 

AATGAATCGC AGGcTTTTGC GTGAAAAAAT TGTATTAATA AAGTATGGAT GATTATTTTT 1080 

CTGG&ACAAG GTCAGTATTT GAATGAACTG TGATGTCAAA CCCTTCTGGT GCCGTAAATG 1140 

TATGTGTTGA GGCGTCGGGT TGATAAATAT CAACATGTGT TAATCCATAA CTTTGTGAAT 1200 

TGTTTTGTCT TGCTTGATTG GATTGCCAAG TATTAGCAGC AATATGATGG TGATAATGAT 126 0 

TCGTTGACAT AAATAGCGCA CGTGGAAAAT CAGACACATG TTGGAATCCT AATTGTTCAA 132 0 

TGTAACATTG ATATGCTGCG TCTAAATCAT GTGTTTTTAA ATGTAAGTGT CCAATCATGC 138 0 

CTTTTGCTGG CATTCCTTGC CAACCTTCAT CAGTACGATG TGTTAATAAG GTTTGGCTAT 144 0 

CAACTTCTAA AGTATCCATT TTAACTTTGC CATTTTGCCA TTCCCATGAA GATGAAGGTC 1500 

TATCGCGATA GACTTCAATA CCATTACCTT CGGGGTCGTT GAAATATAAA GCTTCACTTA 156 0 

CTAAATGATC ACCAGCGCCG ATGCCCATAT TTTTTTGTGC CACGAAATAT AAGAAGTTAG 162 0 
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aAGTCTGACG GcCGTCTTCT AATAAATGTA ACG7TAGAGT ATGGcCACCA GTCCCAACAG 174 0 

ATAATACGGT TGTATTATCG TCAGAACTTT TAACGGATAG TCCTAAAATG TTTTTGTAAA 1800

ATGTTGTCAT TAAGTCTAAG TCTCTTACGT TCAGTACAAT GTTTGTCACT TGTGTTGCTG 1860

TTTTATCGTG AAATGCCATT ATGCATCGCC TCTTTTTCTA TTTTTCTATA AGTTAGTATA 1920

AAAAGTATAC CAGAAAAGAA AATGAATTGA TAGCATAAAG TTTGAAATGC AAAATAACTA 1980

GTCGTTTTGC AATTTTAtAT TGATGCGAAC AAAAAAGCGA TGGTACAGTT GCACCATCGC 2 04 0 

AAAATTTATT TAACCAAGAT ATACATCTTG ATATGAATCT TCTTTTTCTA ACATATGTTT 210 0 

GGCAAATGAA CATGAGGCAA TAATTTTCAA ATTATTTTCT CGAGCGTGTT CAACAACTGc 2160

TTTAAGTAGT TTTTTGCCAA CACCTTGACC ACCAAGTTCA TCAGATACGC CTGTATGATC 2220

AATGTTAATT TCATTATTAT CCACAAAACG GTATGTGATT TGAGCTAAAG CATTATTTTC 2280

ATCATCACCA ATATAGAATT TGTTCTCGCC TTGTTTGATT TCAAGGTTAC TCATACATAT 2340

CAACTCCTAT CATGATTGAT TATAGTATTT CCCTATTCTA TTTTAACTTA AACGAAGTCA 24 00 

AAGGTGCATG ACAGTCATGT GACGACATTG CCACATCTAT GTAGTCGTTT TTATTAAGCA 24 6 0 

CAGTTTGAAA TGAAGATGAA AACACGTATC TTGACATTAA ATCTATTCAG CTATATAATT 2520

TATCTCGAAA TCGAAATAAA ATAAAAAAGT TGGTGATCAT ATGGATCGAA CGAAACAATC 2580

TCTCAATGTT TTTGTCGGAA TGAATAGGGC GTTAGACACA TTAGAGCAAA TTACAAAAGA 2640

AGACGTAAAG CGATATGGCT TAAATATTAC TGAATTTGCA GTGCTCGAGT TGCTTTATAA 2 70 0 

TAAAGGTCCG CAACCAATTC AACGTATTAG AGACCGCGTA TTAATTGCAA GTAGCAGCAT 2760 

TTCATATGTT GTAAGTCAAT TAGAGGACAA AGGTTGGATT ACACGTGAAA AGGATAAAGA 2820

TGATAAACGT GTATATATGG CTTGTTTAAC TGAAAAAGGT CAAAGTCAAA TGGCAGATAT 2880

TTTCCCTAAG CATGCTGAGA CATTAACAAA AGCGTTTGAT GTGTTAACAA AGGATGAATT 2940

AACAATCTTA CAACAAGCGT TTAAGAAACT AAGTGCACAA TCTACAGAAG TGTAAGGCGT 3000

GCACTAAAAA TTTACATTAA AGTATCTCGA TTTCGAGATA AATGCACTAA AAATATAAAG 3 06 0 

AGGGTATATA AAATGATAAA TAATCATGAA TTACTAGGTA TTCACCATGT TACTGCAATG 312 0 

ACAGATGATG CAGAACGTAA TTATAAATTT TTTACAGAAG TACTAGGCAT GCGTTTAGTT 318 0 

AAAAAGACAG TCAATCAAGA TGATATTTAT ACGTATCATA CTTTTTTTGC AGATGATGTA 3 24 0 

GGTTCGGCAG GTACAGACAT GACGTTCTTT GATTTTCCAA ATATTACAAA AGGGCAGGCA 3 30 0 

GGAACAAATT CCATTACAAG ACCGTCTTTT AGAGTGCCTA ACGATGACGC ATTAACATAT 3360

TATGAACAGC GCTTTGATGA GTTTGGTGTT AAACACGAAG GTATTCAAGA ATTATTTGGT 3 420 
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TTAAATGAAG GGGTAGCACC TGGTGTACCT TGGAAGAATG GACCGGTTCC AGTAGATAAA 3 54 0 

GCGATTTATG GATTAGGCCC CATTGAAATT AAAGTAAGTT ATTTTGACGA CTTTAAAAAT 3600 

ATTTTAGAGA CTGTTTACGG TATGACAACT ATTGCGCATG AAGATAATGT CGCATTACTT 3660

GAAGTTGGCG AAGGAGGCAA TGGTGGCCAG GTAATCTTAA TAAAAGATGA TAAAGGGCCa 3 720 

GCaGCACGTC AAGGTTATGG tGAGGTACAT CATGTGTCAT TTCGTGTGAA AGATCATGAT 3780

GCAATAGAAG CGTGGGCAAC GAAATATAAA GAGGTAGGTA TTAATAACTC AGGCATCGTT 3 84 0 

AATCGTTTCT ATTTTGAAGC ATTATATGCA CGTGTGGGGC ATATTTTAAT AGAAATTTCA 3 900 

ACAGATGGAC CAGGATTTAT GGAAGATGAA CCTTATGAAA CATTAGGCGA AGGGTTATCC 3960

TTACCACCAT TTTTAGAAAA TAAAAGAGAA TATATTGAAT CGGAAGTTAG ACCTTTTAAT 4020

ACGAAGCGTC AACATOGTTA ATTGGAATGA GGAGGATTTG TGATGGAACA TATTTTTAGA 4 080 

GAAGGACAAA ATGGTGCGCC AACACTAATA TTATTGCATG GTACAGGTGG TGATGAGTTC 4140

GATTTATTAC CGTTAGGCGA AgcATTGAAT GAAAATTATC ACTTGTTAAG TATTAGAGGA 4 200 

CAAGTTTCAG AAAATGGGAT GAACCGTTAT TTCAAACGTC TTGGTGAAGG TGTTTATGAT 4 26 0 

GAAGAAGATT TGGCATTTCG TGGACAAGAA TTGTTGACGT TCATTAAAGA AGCTGCTGaA 4320

CGTTATGATT TTGaTATTGA AAAAGCAGTA CTTGTTGGAT TTTCAAATGG ATCAAATATA 4380

GCGATTAACT TAATGTTGCG TTCAGAAGCA CCATTTAAAA AAGCATTGTT ATATGCACCG 4440

TTATACCCAG TTGAAGTAAC GTCAACAAAG GATTTATCAG ATGTCAGTGT GTTGCTTTCT 4 500 

ATGGGGAAAC ATGATCCAAT TGTGCCATTA GCTGCAAGTG AACAAGTCAT TAACTTGTTT 4 560 

AATACACGTG GGGCACAAGT CGAAGAAGTT TGGGTGAAGG GCCATGAAAT TACAGAAACT 4620

GGATTAACGG CTGGTCAACA AATACTTGGG AAATAACAGT TCTATTAAGA AGCGGACAGA 468 0 

TGGAAAAGAT TTTTACTTTT CATCTGCCCG CTTTTTTGAT TTTGAAGTGC TGTACTAAAT 4740

TTTACAATAG TATAGATATT TTAATCGATA TGAGATTTGC CGGTAATACG CTTAATTAAA 4800

CCTTTATAGA GTACAGGTAT GAGTAAGATG AAACCGAACA ATCCCATAAT AGGGAATACT 4 86 0 

TTTCCAATTA ATGAAATGAa ACCGATAAAT GTACTAATAT AAGTGATGAC AGCCATTGTA 4 92 0 

ATAATAATGA TGAAGTAACG TCTGCTGAAT GGAACGCTGA AACGTGACGC AAATGCATAC 4980

ATTAATCCAA CAACAGTATT GTAGATGACA AGTATCATAA TGACAGACAT AATAATACCA 5040

ATTGACGGAG ACATTTGTGT CGCTAATTTT AATGTAGGTA GATCTACGTG TTTAATTTTA 5100

TCGAATTGAG AAATTAAACC TAGATTAATC ATCATGAGTA AAAATGTAAT GATTAAACCG 5160 

CCAATCAAGC CCCCGTATAA CGTTGAGTCA CGATATTTAA CTTTACTACC CATCACTGAT 5220 
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CCAGGTGATA ATOATTTCTG CTTATGAATC TGAGCATCAT TATT AGCGGC AGTAAAATCA 534 0 

AGATGACTTG TTGTGAAATA GTAGACCGCA ATCATAATGA CAATCGCAAT TAAAAATGGG 54 0 0 

GTAACACCGC CAAGCACAGC AATTAAACGA TCGAATTTTA GAAACAGTGT TGCTAAAATA 5460

AAGGCGACTA ATATGAGTGC GCTCAGCCAA TACGGTAAGT TGAAACTTTG ATGAATGGTT 5 52 0 

GACGCACCAC CTGCAGTCAT AATAATAGCT AAAGACAACA TAAACATTGT TAAAATAATA 55 8 0 

TCAAAACCTC TTGCAATAGA GGGGTATAAG AAATAGTTAA TTGAATCAGA ATGATTTCTG 564 0 

GACTTTAGAT GATGACCTGT ATGCATGACA ACCATTCCAC CTAAAGTAAT CAATAGTCCT 57 0 0 

GTTACAATAA TGCCTGAAAT GCTATATGCG CCATGACTTG TGAAAAACTG GAAAATTTCT 5760

TGACCAGTAG CAAAGCCGGC ACCAACGACA ACACCAACAA AGGCAAATGC CACAATAATG 5 82 0 

GACTCTTTTA AGATACGCAT GATTTAAAAA TGTCCCTTCG TAATTTTAAG TAATATAGAA 5 8 80 

AATGTAACAT ACATGTTAAT GAAAAATATA GTACTAATAT AGTATTTTGT TAAATTGGAG 5940

TAGAAGCGAG GGTGTCGGTC ATTTCATTAA TTTATTAGTT GATTTTGCAT TTTTTTGCTG 6 000 

TAAAGTTGTT ATAATACAGT TAACAGGAAT TAGCATAGAT ACACCAATCC CCTCACTACT 6 06 0 

CGCAATAGTG AGGGGATTTT TTTCGGTGTA GCTAGGTCGC CTATTTATCA TCGTGTTTGC 6120

GTAgCaATGC GTAAACACAG TACCACTAAA TAAGTGCACG ATACATGCAT CAAATGTCGT 6180

CTTTAGTcTA AGTAACGATC ATGCATTAAC ATTTTCAAAA TATCTATTTG AGCTTGAAGA 6240

TCTTTACCAA TATTGGTATC ACGAATCTTC TTACGTTGTA ATTCTTTATC TACGACGCGC 6 3 00 

TTTATAGAAA GTTCATCGAT ACCTTCGGAA AGTATTTTTn CTTTAGCGTT AAATTGTTGG 6 36 0 

TGTGCAACGA GTTGCATACC GAATGAATTA TACAATAGTG TATAGCCTGC AATGCCAGTn 64 2 0 

GTTGACTGAT AAGCTTTTGA AAAGCCACCA TCAATGACAA GCATCTTTCC ATCAGCCTTG 64 8 0 

AT T 6482 
(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16592 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

ATTTAAGGCG ATTGCTTGTG TATTTCTCTC TTTTGTAGGC AAACCTGCAC TCGTTCCAAA 60

AAATGTAACT TCCATATATG CCCCTCCTTT TCTTCAATTC ATTTTATCAT AAAATTTGTA 120 
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AATTTTTCTA ACTTTAACGT AGACATAACT ATATAAATTT TGATAATTAC GTTATACTTA 24 0 

TCATTAATAA GTATCACATT AAACATGATA CATGAATCGA TATTTCATTT AAGACACTGC 300 

ATACAGTCGA GCATATTGTA TGACCTACTG AATGGATTAT CTTATAATAA TAAATCATAT 360 

ATCTAATTAA GAATTGAGGT TTTAATCTTG AGTACTAAAA ACAAACACAT CCCATGTTTA 420

ATCACAATCT TTGGTGCACT GC 3TGACTTA AGCCATCGTA AGTnGTTTCC ATCAATATTC 480

CATCTCTACC AACAAGACAA TTTAGATGAA CATATTGCCA TcATCgGTAT TGGACGTCGT 540

GACATJcwnTA ATGATGATTT CCGTAATCAA GTAAAATCAT CAATTCAAAA GCACGTAAAA 600

GATACAAACA AAATTGACGC GTTTATGGAA CATGTCTTCT ATCATAGACA TGATGTTAGT 660

AATGAAGAAA GCTATCAAGA ATTACTAGAT TTTAGTAATG AATTAGATAG CCAATTTGAA 72 0 

TTAAAAGGTA ATCGACTATT CTATTTAGCA ATGGCACCAC AATTCTTTGG CGTTATTTCT 7 80 

GATTATCTAA AATCTTCTGG TCTTACTGAT ACAAAAGGAT TTAAACGCCT TGTTATCGAA 840

AAACCATTCG GTAGTGATTT AAAATCAGCC GAAGCATTAA ACAATCAAAT TCGTAAATCA 900 

TTTAAAGAAG AAGAAATTTA TCGTATTGAC CACTATTTAG GAAAAGACAT GGTTCAAAAT 96 0 

ATCGAGGTAT TACGTTTTGC GAATGCGATG TTTGAACCAT TATGGAATAA CAAATATATT 1020 

TCAAACATCC AAGTTACATC TTCTGAAATA CTAGGTGTTG AAGATCGTGG TGGTTATTAT 10 80 

GAATCAAGTG GCGCGCTAAA AGATATGGTG CAAAACCACA TGTTACAAAT GGTTGcATTA 114 0 

TTAGCTATGG AAGCACCTAT TAGTTTAAAT AGTGAAGATA TCCGTGCTGA GAAAGTAAAA 12 0 0 

GTACTTAAAT CACTGCGTCA TTTCCAATCT GAAGATGTTA AAAAGAACTT TGTTCGTGGT 126 0 

CAATATGGCG AAGGCTATAT CGATGGTAAA CAAGTTAAAG CATACCGTGA TGAAGATCGC 1320

GTTGCAGATG ACTCTAACAC ACCTACCTTT GTTTCAGGTA AATTAACAAT TGATAACTTT 1380 

AGATGGGCTG GTGTACCATT CTATATTCGT ACTGGTAAAC GTATGAAATC TAAAACAATT 144 0 

CAAGTTGTCG TTGAATTTAA AGAAGTACCA ATGAACTTAT ACTATGgAAA CTGaTAAACT 1500

GTTAGATTCA AACCTATTAG TAATCAATAT CCAACCTAAT GAAGGTGgTA TCTTTtACAT 156 0 

CtAAATGcTA AGaAAAATAC ACAAGGTATC gAAACAGrAC CTGtCCmATT GtCTTACTCm 162 0 

ATGaGCGcTC aAGaTAAAAT GaATACTGTA GATGCATATG AAAATCTATT ATTTGATTGT 16 80 

CTTAAAGGTG ATGCCACTAA CTTCACGCAC TGGGAAGAAT TAAaATCAAC ATGGAAATTT 174 0 

GTTGATGCAA TTCAAGATGA ATGGAATATG GTTGaTCCAG AATTCCCTAA CTATGAATCA 180 0 

GGTACTAATG GTCCATTAGA AAGTGATTTA CTACTTGCTC GTGATGGTAA CCATTGGTGG 186 0 

GGACGATATT CAATAATTGA ATTAAAACGC ACATGTTAAA CAAAAATAAA TGAGCGAATG 192 0 
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15 



TATATTATGA AATTATATTT TACAATGCCC AAAACTATTT TAATAATCAT TGAACAAATG 2040

GGTGTATAAT TTATAGAAAT AATGTAGAAT AAAAATAAAT GATTGAATTA ATTGGAGTGA 2100 

AAGTTTTGGA CGTTATCAAG CAAATACAAC AGGCAATTGT TTATATTGAA GATCGTTTAT 216 0 

TAGAGCCTTT CAATTTGCAA GAATTAAGTG ATTACGTTGG TCTTTCGCCA TACCATCTTG 222 0 

ATCAATCATT TAAAATGATT GTCGGCTTAT CTCCAGAAGC TTATGCACGC GCGCGTAAAA 2280

TGACACTCGC TGCAAATGAT GTGATTAATG GTGCTACACG ACTTGTAGAT ATCGCTAAAA 234 0 

AATATCACTA TGCAAATTCA AATGATTTTG CAAATGATTT TAGTGATTTT CACGGCGTAT 2 4 00 

CACCTATTCA AGCCTCTACT AAAAAAGATG AATTACAAAT TCAAGAGCGA TTATATATCA 24 6 0 

AATTATCAAC TACTGAGAGA GCACCTTATC CATACAGATT AGAAGAGACA GATGATATTT 2 52 0 

CATTGGTTGG ATATGCACGA TTTATAGACA CTAAGTATTT GTCACATCCT TTTAATGTTC 25 80 

20 CGGATTTTTT AGAAGACTTG CTCATTGATG GTAAAATTAA AGAGTTACGA CGATATAATG 2 64 0 

ACGTTAGTCC ATTTGAACTA TTTGTTATTA GTTGTCCTCT TGAAAATGGT TTAGAAATAT 2 700 

TTGTAGGTGT ACCAAGTGAA CGTTATCCTG CACACTTAGA AAGTCGATTT TTACCTGGCA 2 76 0 

AACATTGTGC GAAATTCAAT TTACAAGGTG AAATTGATTA TGCAACTAAT GAAGCTTGGT 2 82 0 

ACTATATTGA ATCAAGTTTG CAGTTAACAT TGCCATATGA ACGAAATGAT TTATATGTTG 2 880 

AAGTGTACCC TCTCGATATT TCATTTAATG ACCCATTCAC TAAAATTCAG CTTTGGATTC 2 94 0 

CTGTTAAACA GAGTCCTTAT GACGAAGATT AAATAATAAA AAACAAAGAA GCCCCCTAAT 30 00 

ATATCTATAG GTCTACAAAT GGCCTTAGAT TCTATTAGGG GGCATATTAA TATGTTAATT 3060

TAGTTCGATA ACACATGCTT CATATGGACG TAACTGTTTT AAATTAACTT TGGCATCATA 3120

ATTAAATAGC TTTACTTCTC CATGGCTTAA ATCAAATGGT ACAGTTAATT CTGCTTCGTG 3180 

GTTAGTAAGA TTACCTACAA TAAGAACTTG CTTTTCATTT AATGTTCTCG TGTACGCAAA 3 24 0 

AACTTGTGAA TTTTCAGCAT CTACTAAATC AAATTGACCA TATACGTATA CATCATTAGA 3300

CTTTCTTAAT TGAATTAAAT CTTTATAAAA TTGTAATACT GAATGCTCAT CTTCTAATTG 336 0 

TTGTGCAACA TTGATAGTTT TATAATTCGG ATTCACTGGG AACCACGGTT CACCATTTGT 34 20 

AAATCCTCCA TTTAACGTAT CATCCCATTG CATTGGTGTG CGAGAATTAT CTCGGTTCTC 3480

ATCTTTATAT TTCGCAAGTA AAGCGTCTAC ATCTCCACCT TGAGCTTTCA CTATTTGATA 3 54 0 

GTCATTTTTA ACAGCAACAT CGTTAAACGT TTCAATACTT TCAAATGGAT AATTCGTCAT 3 600 

ACCAATTTCT TGACCTTGAT AAATGAATGG CGTACCTTGT TGCAAGAAAT AAACAGCTGC 3660 

ATGACTTGTT GCTGATTCAT ACCAATACTT GTCATCGTCA CCCCACGTCG ATACACGTCG 3 72 0 
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CCATCTATTT AATACAGATT TATACGAATT TACATCAAAG TGAGAATCAC CACTATTCCA 3 84 0 

CAGTCCCAAA TGTTCAAATT GGAATATCAT ATTAAATTTA CCATTTTCTT CCCCGACCCA 3 90 0 

GTCATCAGCA TCATCAGGGC TTACACCATT CGCTTCACCA ACAGTCATAA TGTCATACTT 3 96 0 

ACTTAATGAG CGATCTTTCA TCTCTTGTAA CCAAGTTTGT ATACCTGGCT GATTCATATC 4 02 0 

TACATCAAAT GCTGGGGCAT ATGTTTTACC CTCAGGTACA GGTAAGTCAC CCGCTTCAAA 4080

CGTCTTCTTA ATATGCGTAA TTGCATCTAC TCTAAATCCA TCAATGCCTT TATCAAACCA 4140

CCAGTTCATC ATTTCAAATA CAGCATCTCT AACTTCCGGA TTACCCCAAT TCAAATCAGG 4200

TTGTTTTTTA CTGAATAAAT GGAAATAATA TTGCTCAGTA TTAGCATCAT ATTCCCATGT 4260

AGATCCATTA AATATACTTT CCCAGTTGTT AGGTTCAGAG CCATCTGGCT TTGGATCTTG 4320

CCAAATGTAC CAATCACGTT TGGGATTGTC TTTACTAGAT TTGGATTCTA TAAACCAAGG 4380

ATGTTCATCA GATGTATGAT TTACAACTAA ATCTAAAATA AGCTTCATGC CTCTATCATG 4440

AACACCTTTT AATAAACGAT CAAAGTCTTC CATCGTTCCA AATTCATCCA TAATCTCTTG 4 50 0 

GTAGTCACTA ATATCATAAC CATTGTCATC ATTAGGTGAT TTAAACATTG GACTGAGCCA 4 56 0 

AATGACATCG ATACCGAAAT CTTTTAAGTA GTCCAATTTA TCAATCATTC CAGGTAAATC 4 62 0 

CCCAATACCA TCGTGATTAC TATCATTAAA ACTTCTTGGA TATACTTGAT ATGCTACTGC 4 68 0 

TTCTTTCCAC CATTGCTTAT TCATTTTAAA ACTCCTTTGC TATCGCTGTG TTGATTTTCT 4 74 0 

TATTTTTAAT TCTGTATCTA TAATGACGAG TTCAATAACA TCCTGTGCTT TGTTTTTCAA 48 0 0 

TATATTTAAA ATTGCTGCAC CAGCCTGTTG ACCTAACATT CGAGGCTTGA TGTCAATACA 4 860 

GGTTTGTGGT GGTGACGCAA TTTCGGTTAA ATAAGAATCA TTGAACGTTG CTGTCATTAC 4 92 0 

ATCTTTCGGA ATTTCAATAT TAAGTTCATA TAGGACACTT AAAATCGCTA AATGTAACAT 4 980 

AGCATCTAAC GAAATGATTG CCTGTTTAAT ATTTGGGTCC TTCAAACGCG TATGTAGATT 504 0 

TTGCATGTAA TTAAAAATAA CTTCTCTTTC ATTACTAGTC TCAATAATTT GATAATTAAT 510 0 

TTTATTTTGA GAAGCTATCG TTTCAAATCC TTGAATTCTA TCTTTTGAAA CTTCAAAATT 5160 

TCCTTTTTCT GTAATAAATA TTAATTCATC TACACCTTGT TCAATAACAT GTCGTGTCAA 522 0 

ATTTTCAGAA GCTAATATAT TATCATTATC TATATGTGTA AATTGATGAT CTATATCCGA 52 8 0 

TGTAGGCTTA CCAATCACAA TAAATGGCAT GCTTTCATCA ATTAACATTT GTTTAATCGG 5340

ATCATTTTCT TTTGAATAGA GCAGTATAAA CGCATCAACC ATTCGTTGTT TAATCATTTT 5400

ATAAACTTCA TCCATTAAAT CATTCATATT ATTTGAGACT GTCGTTTGTG TACCATAGCC 5460

ATGCTGGTTA CACGTTTCAG AAATTCCTAG CAATACATTG ATGTAGAATG GATTCAGTCG 5520 
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AGTTCTAGCA GCGGTATTAG GAAAATAATT CAATTCTTCC ATAACTTTCT TCACTTTTGA 5 64 0 

AATTGTCGCT TCGCTAATAC GTTGATTTCC TTTTATAACT CTTGAAACTG TCGAAGGAGA 5700 

AACACCGGCT TTTAGTGCAA CATCTTTAAT CGTAACCATT TAATCACCTC CTGTTAATTT 5 76 0 

CTGCATCGGA AAACGCTTCC AACCACTGTA TAATACCAGT TTAGTCACAC TTTCTAAAAA 5820

AGTCAAAAGA TTTGTGCAAA CGATTGCATA AAACGATAAA AATAAAACCT TCATACTGAA 5880

ATTCAATCCG AAAATCAATA TAAAGGTTTG TATAAATATT AAAATCGATT GTTTAGTCAC 594 0 

TAACTGCAAA ATAGTTACCT TGGCCATCTT GAAAATTAAA TACACGTTGA CCATTCATTT 6000 

CTACTATATC ATGCCCAGTT AAACCTAAAT CATTTAATTT TGAGTATAAT GCATCAAAGT 6 060 

TTTTCTCTTT AAACATTAAA GATGGTGTTC CTAGGTTCAC TTCCGGGCTA TGCTTTTCAA 6120 

TAAATTCTTT TGCCATAATC GTCAATGACG TTTCAGCATC TTTGGTAGGT GATACTTCAA 6180 

CTGCAACATA GTCCTCAGCT AACGGTGTTT CACTTACAAC AACAAATTCT AAAGTTTCTG 6240

TCCAAAATGC TTTCGCTTTT TCGACATCAT CAACATATAA CATAACTTGA TTTAACTTTT 6 3 00 

CCATAAAATA GTACCTCTAT TTCTCTATAG TACATGCTAT CATAACACAG TAAATATTTT 6 360 

ATTACTTCAC AAAATGCTTA AAAATATGGC GGGATGCTTT TAAGGTCAAG GATAATACTT 6420

GTGTAATTTT TTATAGGTTG TAGCTACTCT ATCACACTCT CTTTTATATT TATCAAAAGA 6480

TATAAAAAAG GATAGTATCT TTCAACTATC CTTTAATCAA TATTATTCTT CAATCCATTG 6540

TGTATGGAAT ACGCCtTCTT TATCTTTTCT TTCGTACGTA TGAGCACCGA AGTAGTCACG 6600 

TTGTGCTTGA ATTAAGTTTG CAGGTAAATC AGCAGCACGG TAACTATCAT AGTAATTAAT 6660 

ACTTGATGAG AAACCAGGTG TTGGTACACC ATTTTGAACA CCAGTTGCGA CAACATCACG 6720 

TAACGCATCT TGATATTCAG TAACGATGTT TTTAAAGTAA GGATCTAGCA ATAAGTTTTG 6780 

TAATCCTGGA TTATTATCGT AAGCATCTTT GATCTTTTGT AAGAATTGTG CACGGATAAT 684 0 

GCAACCTTCT CTCCAAATCA TAGCTAAATC ACCAAGTTTT AAATTCCATT CATTATCTTC 6 90 0 

ACTTGCTTTA CGCATTTGcG CGAAACCTTG TGCATAAGAA CAAATTTTAC TCATATATAA 6 96 0 

TGCTTTACGA ATTTTTTCTA AAAAGTCTTT CTTGTCACCA TCAAATGATG CTTTTGGACC 702 0 

ATTTAATTCT TTAGAAGCAT TTACGCGCTC TTCTTTGaTT GAAGAGATAA AACGTGCAAA 7080 

TACAGATTCA GTAATGATTG TTAATGGAAT ACCTAATTCT AATGCGTTAA TTGAAGTCCA 714 0 

TTTTCCTGTA CCTTTTTGaC CTGCAGTATC AAGAATTTTT TCAACTAATG CTTCTTTATT 7200 

TTCATCTAAT TTCATGAAAA TATCACCAGT GATTTCAATT AAATAACTTT CTAATTCACC 726 0 

AGCATTCCAG TCTTTGAACG TTTGAGCAAT GTCTTCATGA GACATGCCTA ATAATTCTTT 73 2 0 
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CATTTTCACA TAGTGTCCAG CACCATTAGG TCCAATATAA GTAACACATG AAGCACCGTC 74 4 0 

TTTTGCCTTT GCAGCAATTG CATCAAGAAT ATCTGCAACT TTGTTATAAG CTTCTTCTTG 7 5 00 

TCCACCCGGC ATTAATGACG GACCAGTTAA CGCTCCAATT TCACCACCAG AAACGCCCAT 7 56 0 

ACCAATAAAG TTGATTGCAC TTTGTGywAA TGCTTTATTA CGTCTGATAG TATCTTGATA 76 2 0 

GTTTGTATTA CCACCATCAA TTAAAATATC TCCATCATCT AATAAAGGTA ACAAACTATC 7680

AATCGTTGCG TCCGTAGCTT TACCTGCTTG AACCATTAAT A7AAATTTTAC GTGGTTTTTC 7740 

TAAAGAATTA ACAAATTCTT CCAATGAATA CGTTGGATGA ATATTTTTCC CTTTTGATTC 7800

TTCAACCATT AAATCAGTTT TTTCACTTGA GCGGTTAAAT ACAGATACAC TATATCCGCG 7860

TGATTCAATA TTCCAAGCTA GGTTTTTACC CATAACGGCT AAACCAATAA CTCCAATTTG 7920

TTGTGTCATA TTACTTACCT CACTTGTTGA TTTTTCATTA GTATTGTATC ACAAAATAGA 79 8 0 

CATACACTAC ACTAAATCAT TTCGAATGTC GCGCAACTAT TTTGATTATT TCTAACACTT 8 04 0 

GACTTGCAAG CAAGTTCAAT GATTTAATCG GCATTCTCTC ATTTGTTGTA TGGATTTTTT 8100 

CATAACCCAC TCCTAAAATG ACTGAAGGAA TACCAAATGT ATTAATAATA CTGCCGTCTG 8160 

AACCGCCACC AGAAATAATT GTATTTGCAG ATAATCCTAA ATTACGAGCA CTTTCTTGTG 82 20 

CAATTTTAAC AACCGCTTCA TTATCATTAA TTTTAAATCC TGGATAACTT TGCTCCACTG 8 2 80 

TAACTACTGC TTTCCCACCT AATTCTGATG CAGTAGTTTC AAACACATCA G T CAT ATGTT 834 0 

TGACTTGTGT TTTTATTCTT TCTGGATCGT GAGAACGTGC CTCTGCTTCT AAAATGACTT 84 00 

CATCTGCAAC AATATTCGTA GCTGAACCGC CATGAAACTT ACCAATATTG GGAGTAGTTA 84 6 0 

TTTCATCAAC TTGTCCTAAT TTCATTCGAC TAATTGcTTT CGCCGCAATA TTAATAGCAC 852 0 

TAACACCCTC TTTTGGCGTA CTTGCATGAG CCGTTTTGCC AAAAATTTTA GCTGAAATTA 8 5 80 

ACATTTGCGT CGGTGCACCT ACAACCGTAG TACCGACATC AGCACTTGCA TCAATAGCAT 864 0 

AACCAAAGTC CGCGTCCAAC AACTCTGAAT TTAATTCTTT AGCACCAATT AAACCTGATT 870 0 

CTTCTCCAAC AGTAATCACA AATTGAATTT GTCCATGTGG GATTTGTTGT TCCTTTATCA 8 760 

CTTGCAAAAC TTCAAGCATC GCTGATAATC CTGCTTTATC ATCTGCACCT AGAATAGTCG 8 820 

TACCATCAGA GTATATGTAG CCGTCATCTT TTACAATTGG CTTTACATTA ATTGCGGGTA 8880

CAACAGTATC CATATGGCTC GT CAAA TATA ATTTAGGTAC TTCGCCTTCT TCGATAG T A C 8 94 0 

TATTCATTGT ACACACTAGA TTATTGGCAC CTAATTTAGG ATGTTTAGCC GCTTCATCTT 90 0 0 

CTTT AA CAT C TAACCCTAAT GCTATGAATT TTTCTTTTAA AATAGGTTGG ATTGTTGATT 906 0 

CATTCCCTGT CTCAGAATCG ATTTGTACAA GTTCAAAAAA CGTATTAAGT AATCTTTGCT 912 0 
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15 



GATGAAATAA AATGTTACAG TAATTGACGT TACACAGATT TATCAGG'ITr GTAAATTGTG 924 0 

TCATATTATT TTCAATTTAT TATATATAAT TATTGTAACT CAAACTAAGC TTTGTCAAAA 9 3 00 

ATATATTGAT TGATTTTTCA AAGATATCGT ATAATGAGGA AAATGACATA AGCAAACTTA 93 6 0 

CTCATGTTTT TTATTATATT CCTTTATGAT GATTGCTAGT TA7ATCGTCT CAAGTTAAAA 94 2 0 

GTTTTATATC TTATGTCGTA ATTATTAATA CAAAGGTTAT TCATTTGGAG GCACACAAAA 9480

TGCAAAATAA AGTTTTAAGA ATTATCATTA TCGTTATGCT TGTATCAGTT GTATTAGCAT 954 0 

TGTTATTAAC GAGTATCATT CCAATTTTAT AAACTATATC TCAACTACCT ATACAAAATC 96 00 

ATACAATTAA AAATCCATCC ATTATAAACG CATGTATTAA TAAGTTATCG TATTGCAACG 96 60 

ATTACTTTCA AACATGGGTC ATACGGATGG ATTATTTTTT AAGCTACTTC ACTATGCATT 9720 

TTCAATGAAC CAAATTGCGA TTTGATTTGT AAATATTCTT CTAATTCATT TAATATTTGA 97 8 0 

20 ATAATACTTG CTCTCGAGTT AAGCGCTTTG TGTGTTGTTG GCAATGGCAG TTCATCCAAT 984 0 

TTCAAAC GCG TCTCATACAA ATTGTGTAAA CGCATTGCTG TATAGTCATT ACTATTCACA 9900 

TTTAGACCAA TTTCTTTCAG CAGTGACGCA ACATCATTTA AAAGCGGATC TTTATGACAG 996 0 

25 ATACTTTCGA TGAGCGGTTT CATTCTCATT AACAATTCCA CTTGCTCTTC TCGCATATCA 10 020 

AAATAATGAT AGTATGAATT TTCGTTTCTA ACAAAATGAT TTTTAACATC TCGGAACGCG 10080 

ATAGACTtCG CCTTTTTAAT ATTTAAAAGT AACACTTCAA ATTCAAT CGC AATGGTATCT 1014 0 

7CATATTTTT CACAAATATA ACTATATTTA CTAAAAATAT CAGCAATTTG TTGCTCAATT 102 0 0 

TTACATTTGT ATTCGTCt AG TTGTTTGTCT AAACTTGGCA TCATTAAATT CaTTGTAAAT 10260 

GCAATGCTTA GTCCAATTAA CAGTAATAAT GTTTCATTAA CAATTAAATG TGCATCAATT 103 20 

GATTTTGCAT TAAAAACATG AAGTAATATA ACGCAACTCG TAATGACACC TTCTTGTACT 103 80 

TTTAATACGA CAGTTAATGG TATAAATAAC AATACGATAA TACCGAGTAC AATTGGACTC 10440 

TGACCTAATA AACTAAATAT TGCTGAACCT AAAAACAATA CTAAAAAACA TGATACTAAT 10500 

CTTGAAATAA TCGCTTGTAG CGAATGTACT TTTGTATGTT TAATACATAA TACGACTAAT 10560 

ATGGCGCTTG AAGCATAATT ATCTAAACCT AACAGCTTAC TAATAATTAC ACCTAAAGTC 10620 

ATACCCACTG CTGTTTTTAT TGTTCTAAAT CCAATCTTGT AAGGATTTAA CTTTAACATG 10680

GGTTAGCGCC TCTTATCTTT CTTCACAATA TTTATTGAAT AATGTTTGTA ATTGATTAAT 10740 

TACGTTCATC ACATCATGAC CTTCGATTTG ATGTCTTTCA ATCATTTCTG TAATCTTTCC 10800 

ATCTTTTACT AATGCAAATG ACGGACTTGA AGGCGCATAA CCTTCGAAGT ATTCACGCGC 108 6 0 

TCTTTGTGTC GCTTCTTTAT CTTGTCCAGC AAATACTGTC ACTAGACGAT CAGGTAATAC 10920 

AGAATTGATC ATAACTAGTG TTGTACCATC TTGTTTAAGA ACTTTGTCAA CATCTTCTGC 11040 

AGTAGTTAAT TGC7CATATC CCGCAGATTC AATTTCATTC CTTGCTTGTT CTACAACACC 11100 

GTTCATGTAT AAATCGAAAT TCATGnCCAT AAGTTCAATC ACCTATCCCT TTATA7TTAA 11160 

ACTAtCCTCA TTCTACTAAT TAATAACATA TTGTTCAATA AACTAATCTG AATCACACCT 11220 

ATATTTAGAC ACAATTTTAA CAATATACCA AACATTATTG TGCTTAAAAT CATGGTAACT 11280

AATTTGTTCA CATGTTTTCA TT AAT ATG TT TCAAGTATGA TGTCTTATTT TGACTTTACT 11340 

GCAAAAATGC ATTCAACCAT GTTGATTATT GTTCTTTATC TTTTTTGAAT ATATTGGACA 114 00 

is TATTTTAGTG CCAAAAAATA ATACATCCAT CGACAAGAAC AAGATAAAAC AAGTTGTCGA 114 60 

TAGATGCATC TATGTTATCA CTAATATATA TTTGTATTTT CTAAAGTATA CTGTTCGATA 11520 

CGCTGTTTAA TATGATTCAT ArATTTACCT GTTTGTAAAC CATCTAAAAT ACGATGATCA 11580 

20 ATTGAAATAC ATAAATTAAC CATGTTACGA ATTGCAATCA TATCATTAAT TACTACTGGC 11640 

TTTTTAACGA TTGATTCTAC TTGTAAAATC GCTGCTTGTG GATGATTTAT AATACCCATT 11700 

GATGATACTG AACCAAATGT ACCAGTATTA TTTACCGTAA ATGTACCGCC CTGCATATCT 11760 

TCAGCTGTCA ATTGCTTATT ACGCGCTTTC G7TGCTAAAG TATTAATTTC TCTAGCTATA 11820 

CCTTTGATTG ACTTTTCGTC TGCATGCTTA ATCACAGGTA CGTATAATTT ATTTTCATCA 11880 

GCAACAGCAA TTGAAATATT AATGTCTTTA TGTAAGACAA TTTCATTTCC TTGCCAGCTA 11940 

CTATTTAATA AAGGATATGC TTTTAAAGCA TCTGCTACAG CTTTTACAAA GAAAGCAAAG 12000 

AACGTTAGAT TATATCCTTC TTTATTTTTA AAGCTGTTTT TATAATGATT TCTCGTATTC 12060 

35 ACAAGATTTG TAGCATCTAC TTCAATCATC ATCCATGCAT GTGGAATCTC TGTTACACTA 1212 0 

TTAACCATAT TTTGCGCAAT TGCTTTACGC ACACCATTTA CTGGTATTGT GCTGTTTTCA 12180 

CTATTGTCTT CAGATGATTG GTTACTTGAT GTATCTACTG ATGTTGATTT TGTTTGAACT 12240 

40 TGTTTGTCAG ATTGAGCTGT GGTACCACCA TTTTCAATAA CTGACATTAT AT C CTTCTT A 12300 

GTTACACGAC CTTCAAATCC ACTACCTACA ACTTGTGATA AATCAATGTC ATGCTCTGAA 1236 0 

GCGAGTTTAA ATACAACAGG TGAAAAGCGA CCATTATTAC GTGGTTGATT TTGTTTAGCA 12420 

45 GTAGATGTCT GTTCCACTGT TGCACTAGCT TTTTTAGTAG ATTTCTGAGT ATGCTCATCC 12480 

ACTTTTGCTT GTATCTCTTC AGTTGTTTCA TTTGTCTTTT CATCAGCAGT TTCAATTTTA 12540 

CAGATAATTG TATCAATAGC TACTGTCTGC CCCGCTTCAA CTAAAATTTC TGTAATTGTT 12600 

CCTGATATCG TGGAAGGGAC TTCAGCTGTC ACTTTATCTG TAATAACTTC ACATAATGGT 1266 0 

TCATATTCAT C AAT ATG AT C ACCAACAGAA ACTAACCATT GTTCAATGGT GCCTTCATGA 12720 
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AATTCACGCA TTTTATTTAA GATTTTTTCT GGATTCATCA TAATTTCATT TTCTAATACA 12 840 

GGAGAAAATG GCATAGATGG TACAt CTGGA GCAGCTAAAC GCATGATTGG TGCATCTAAA 12900 

TCGAACAAGC AATGCTCTGC AATAATCGCT GACACTTCTG ACATAATACT ACCTTCTAAA 12 960 

TTATCTTCAG TTACAAGTAA AACTTTACCT GTATGTTTAG CACGATCAAT AATTGTTTCT 13 020 

TTATCTAATG GATAAACAGT TCGTAAATCA ACGACTTCAA CATTGATACC GTCTGCAGCT 13080

AAAATATCCG CTGCTTGTAA ACAATAATTG ACCATTAATC CATAACAAAA TACTGTTAAA 1314 0 

TCTTCACCTT CACGTTTCAC ATCTGCTTTT CCTAAAGGTA CAGTGTAATA TTCTTCTGGC 13 2 00 

ACTTCTTCCT TTAAGAAACG ATAAGCTTTT TTATGCTCAA AGTACAATAC TGGATCATTT 13260

GATTCGATAG ATGATAATAA AAGCCCTTTA GCATCATACG GTGTGGAAGG AATAACAATT 13320 

GTTAAACCTG GCGATGAAGC AAATATACTT TCAATACTTT GTGAATGATA TAGTCCTCCG 13 380 

20 TGAACACCGc CACCAAATGG TGCACGAATC GTTAATGGGC ATTGCCAATC ATTATTTGAA 13440 

CGATAACGCA TTTTCGCAGC TTCACTAATA ATTTGATTTG TCGCAGGTAA AATAAAATCT 13 500 

GCAAATTGAA TTTCTGCAAT TGGTCTTTTA CCTACCATAG CTGCACCAAT GGCAGTTCCA 13 560 

25 ACAATATTTG ACTCAGCTAA TGGCGTATCG ATAACTCTGT CTTCAC CAT A TTTTTGTTGC 13 620 

AGTCCTTGAG TAGTACCAAA TACGCCACCT TTTCTACCAA CATCTTCACC AAGAATAAAC 13 680 

ACATCTTTAT TTTGTTGTAA TGCTAAGTCT TGTGCCtGcG TATCGCCTCT AAATAAGATA 13 740 

ATTTAGCCAT TAGTTAAGAC TCCCTTCTTC GTACACAAAT GCATAGGCTT CTTCGACACT 13 800 

TGGATATGGC GCGTCTTCAG CAGCCTTTGT CGCTTTATTG ATGATGTCTT TnATgTCCGC 13 860 

TTCTATTTCT GCCAACCAAG CATCATCGAT AATGCCAGCT GAAAGCAACT CTTTTTTGAA 13 920 

CTTTTCATTG CAGTCTGCTT TTTTAAGcGT TTCACGCTCT TCTTTCGTAC G AT ATTGGT C 13 980 

GTCATCATCT GATGAATGAG CTGTCATACG ACTTGTTACT GCTTCAATCA AAGTTGAACC 14 040 

40 TTGACCAGAA ATAGCTCGAT CTCTTGCTTC TTTCATCGCT TTATACATTG CTAATGGATC 1410 0 

ATTACCATCT ACTTGTTCAC CATGTATACC GTAACCAAGT GCTCTATCCG ATAATTTTTC 14160 

AGCTGCGTAT TGTAATGAAT CAGGTACTGA AATTGCATAT TTATTATTTA TAATGACACA 14 22 0 

TACAAAAGGA AGTTTGTGTA CACCCGCGAA GTTTAAACCT TCATGGAAGT CACCTTGGTT 14280

TGAGCTACCT TCACCAACAG TTGCTGTTGC AATTTTCTTC TTACCATCCA TTTTTAAAGC 14 3 40 

TAAAGCAGCA CCAACAGCAT GGGGTATTTG AGTTGCTACC GGTGAACTTT GAGACAAAAT 144 00 

ATTCTTAGCT CTACTACTAA AGTGTGATGG CATTTGTTTT CCACCAGAGT TAACATCGTC 14 4 60 

TTTCTTTCCA AACGCTGATA AAAACGTATC ATACGCTGAG ATACCCATAT AAGTAACGAA 14 520 
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AATCTGAGTT GCTTCTTGTC CTTGACCACT TACAACAAAT GGAATTTTAC CTGCACGGTT 14640 

CAATAACCAC AGTCTTTCAT CTATTTTTCT ACCTAAATCC ATCCATTTAT ATATTACTTT 1470 0 

TAGGTCTTCT TCGCTAAGGC CTAATGATTT ATAATCAATC ATGTTAAATC CTCCTATTTA 14 76 0 

TACGTGAATA GCTCTACTTT CTGCTTTCAA TCCTAATTCC ATCAACACTT CAGAGATGGA 14820

AGGATGTGCG TGTGTTGTTA GTCCTAATTC TAATGCCGAG CCATTCATGA ACTGTAACAG 14880

TGATGCCTCA TTAATCAATT CTGTTACATG TGGACCAATC ATATTAATAC CCACAATTTC 14940

TTCAGTTGAT TGATCAATCA CCATTTCGCT ATACCCT7 CG TTTGTGTCAT GGCTATCAAT 15000 

CACTGCTTTA CCAATTGCTT TAAATGGTAC TTTAAAACTT TTAACTTTCA TTCCCTCTGC 15060

CTTTGCTTGT TCAATGTTTA AACCGATAGA AGCAATTTCA GGTTGTGAAT AAATACACTT 1512 0 

AGGCATCATG TT AT AG TIT A CTGGGATTGG GTTCCCCTCA AACATATGAT CAACAGCCAC 1518 0 

20 AACACCTTCT TTTGATCCAA CATGTGCCAA TTGTAATTTT CCTATACAAT CACCAGCTGC 1524 0 

ATAAATATGT TTATCTTCAG TTTGTTGAAA TTCGTTCGTT AAAATATGTC CTGATGTTGa 15300 

AAGtTTTATT TTAGTGTTGT TTAAACCAAT ATCTGATGTG TTAGGTTTTC TACCAATCGA 15360 

25 TAGCAACACT TTATCTACTT TAATTATGTC TGAGGAAATT TCAAACGTAA CACCATCTTC 15420 

GTTAACATTT AT AT CATTTT CAGAAAGTTT TATTCCCTCA TAGAATTTAA CACCACGTGC 15480 

TGACAATGAT TTTTTTAATA GTTGTGAAGC TTGTTTACTT TCAGTTGGTA AAATTCTTTC 1554 0 

ACCTGCTTCT ATAACTGTTA CGTCAACACC TAAATCTATC ATCAATGATG CAAATTCCAT 15600 

TCCGATAACA CCACCACCAA TAATACCAAT ACTTGATGGT AACGTCTTTA ATGATAATAT 15660 

ATCATCGCTA GATAAAATTT TATCATGATC AAATGATAAG AATGGCAACT CTGCAGGCGA 15720 

AGAACCAGTT GCAATTAATA CAAATTGGTT GGGTAATAAG TCTGATTCAC CATCTTCATA 15780 

TTCGACAGAA ATTGTGCCAC TTTGAGGTGA AAATATAGAT GTACCTAGAA TACGTCCCGT 15840 

40 GCCATTATAA ATGTCAATGT GATTGTGTTG CATTAAATGC TTTACACCTT GATACATTTG 15900 

ATTAATAATG TCTTCTTTTC GTGCCAACAT ATTTTCAAAA TTAACATTAG CATCTTTGAC 15960 

ATCAACGCCA AACATTGCTG CCTGTTTTAC TGTTTGAAAT ACTTCAGCAG ATTTAAGCAG 16 020 

45 CGATTTAGTA GGAATACAAC CTTTATGGAG ACAAGTACCT CCTAATAGTT GTCGTTCTAC 16 080 

TATTGCCACT TTTTTACCTA ATTGAGACGC ACGTATCGCA GCAACATATC CTGCAGTACC 1614 0 

TCCACCGAGA ACGACTAAAT CATATTGTTT CTCTGACATG TTCTTACTCC TAACTAATGA 16200 

TATATATCCA TTGAAAATTT ATTAATACAT AGTTTTCATG TCCATTAATT ACCTATTTTA 16260 

CATGATTGTC TATTTAGTTT GAATGCACAT AAATAAATCC ATAAATGAGT ATTCAACACA 16 32 0 
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TAAATCAGTA ACACTTGCAC CTGAAATCAT 7CGTGCAATT TCATCTACTT TATCATCGCT 1644 0 

AATTAACTCT TGAACTTGTG TTGTTGTACG ATCATCTTTT GATGATTTCG AAATTAATAA 16 500 

ATGATGGTCG CTCATCGATG CAACTTGTGG TAAGTGAGAG ATACAAA7AA CTTGTATATA 16560 

TTCTGCTaTA TCTCGCATTT TCTCTGCCAT TT 16592

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13794 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

CCAATACAAC GTAAAAAGAT TGCTTGTGTT ATTAATGAGT TAGATAAAAT AATTAAAGGA 60

TTTAATAAGG AAAGAGACTA CATAAAATAT CAATGGGCTC CAAAATATAG CAAAG An TTT 12 0 

TTTATACTTT TTATGAACAT TATGTACTCA AAAGATTTTT TAAAATATCG ATTTAATTTA 180 

ACATTTCTTG ATTTATCTAT CTTATATGTA ATATCATCTC GAAAAAATGA GATACTAAAT 24 0 

TTAAAAGATT TGTTTGAAAG TATTAGATTT ATGTATCCTC AAATTGTTAG GTCAGTTAAT 3 00 

AGATTAAATA ATAAAGGTAT GCTAATCAAA GAACGATCCC TTGCAGATGA AAGGATTGTG 3 60 

TTAATCAAAA TAAATAAAAT ACAA7ATAAC ACTATTAAAA GCATATTCAC AGATACTTCC 4 20 

AAGATTCTCA AACCAAGAAA ATTTTTCTTT TAAATTTAAA CAGATTTACC T CT TG AT AAA 4 80 

ATAAATAAGC AATCATACTA CTTCTCAATT TAGTATAAAT AAAAATACAT AATTAACTTT 54 0 

CTTTTGTTTT TATATTATTT CAATACCCTA CTATATATCA CAACACATAA ATTAAGCATG 6 00 

ACACTCATTC AATTTAGTTC ACCATTTCGT GTTCCAATTT TACT GAG TAT CATGCTTTTA 6 60 

ATGTTATAAA CCTAATGCTT TAATAAATCG TGTTAATTCT TCT CG CAT A C TGTCATCTTT 720 

CAATGCATAT TCTATGGTAG TTTTAACGAA GCCTAATTTT TCTCCAACGT CATAACGTTC 7 80 

GCCTTCGAAG TCATATGCAT ACACTTGGTT ATCATTATTC ATACGTTCAA TCGCATCTGT 84 0 

TAACTGAATT TCGTTACCTG CGCCTTCTTT TTGCGTTTTT AAATAATCGA AAATTTCAGG 900 

CGTTAATACA TAACGTCCCA TAATAGCTAG GTTTGATGGT GCCGTACCTT GTGCTGGCTT 96 0 

TTCAACAAAC TTTTTCACTT CATACTGACG TCCGTTTTTA GTTAATGGGT CAATAATTCC 102 0 

ATAACGATGA GTATCTGCTT CCGGAACTTC TTGGACACCT ATAACTGAGT GCCCTGTTTC 1080 

TTCATAAACG TCAATCAACT GTTT CACTGC TGGCACTTCA GATTCAACAA TATCGTCACC 114 0 
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TAAACCTTTT TGTTCTTTCT GCCTTACATA AAAAATATTC GCAAGTTCCG TTGAATACTG 12 6 3 

AACTTTCTCT AGTAATTCAG ATTTACCTTT TTCTTTTAAC ACCATTTCTA ATTCTTTTTG 13 2 0 

5 ACT AT C AAAA TGATCTTCAA TCGCGCGTTT GTGGCGACCT GTCACTATAA TAATATCTTC 13 80 

AATTCCAGCT CTTGCAGCTT CTTCAACGAT ATATTGTATT GTGGGTTTAT CTAAGATAGG 14 4 0 

AAGCATTTCC TTT GG CAT CG CTTTAGTTGC TGGTAAAAAT CTAGTCCCTA AACCAGCAGC 1500 

GGG AATGATT GCCTTTTTTA TTTTTTTCAA AGTTAATGTG CTCCTTTTCC TAAGTATTAA 1560 

ATCTATGTAT CAACGTCATT TTAACACTAA TTAGAACGCC TT CAT AG TGT CATTGAGTAT 1620 

GTAATTATTT CTTGGGAAAT TTGTTTTAAT TTTAAAAAAC AGGCTTACTT CATATAATTT 16 8 0 

ATGAAATAAA CCTGTCAATT TTGGATTGAT TATGCTTTGT GATTCTTTTT ATTTCTGCGT 1740

AATAACGCTA AACCTAAAAT GCTAAATAAT CCGCCGAACA ACATGCCGTT GTTTGTTGAT 1800

20 TCTTCTCCAC CTGTTTCAGG TAGTTCAGAT TTCTTAGATT GTGCTTTTTT AGTTGGTACC 186 0 

ACTGCTTTAA CCTTTTCATT GATTTCAATA ACAGGTGTTA CTACTTTACC TTGTTCCACT 192 0 

GGTTTAGAAG GTTTTTTAGG TTCTTCTTTA GCAGGTGGTA TTGGTTTACC AGGTTCAGTT 1980 

25 GGTACCTCTG GCGTTGGCGG TGTTGGTGTT TCCGGCTCGC TTGGTACTTC TGGTGTCGGT 2 04 0 

GGTGTTGGTG TTTCCGGCTC GCTTGGTACT TCTGGTGTCG GTGGCGTTGG TGGCACGATT 2100 

GGAGGTGTTG TATCTTCTTC AATCGTTTGT TGACCTTCAT TATGACCACT TACTTGTGGA 2160 

AGTGTATCTT CTTCAAAGTC AACACTATTG TGTCCACCGA ATTGATAATT TGGTTTATCT 2220 

TTATTTGTAT CTTCTTCAAT AATTTCAGTG TGCTTATTGA ATCCGTGAAT ATGTGGCACA 2280 

CTGTCGAAGT CGATATCAAT GATATTACCA CCTTGTTCAT ACTTAGGTTT GTCTTTCTCT 234 0 

GTATCTTCTT CGAATGATTG GTTACCATTA TTTTGACCAT GAATTTGAGG TACACTATCG 24 00 

AAATCGATAT CTACGATATT GCCACCTTGT TCATATTTCG GTTTATCTTC TTCTGTGTCT 24 60 

TCCTCAAATG ACTGATTACC GCTATTTTGG CCACCTTCGT AACCTAATTC ACTCTTAATA 2520

TCCACGTGGC TATTTTCTTC GATTTCTTCA ATCACGCCAT AATTACCGTG ACCATTTTCA 2580 

GTTCCTAAAC CAGAATGAGA AATATGATGA TTGTTTTCAG TAATTTCCTC GATTGGTCCT 2640 

45 TGCGCTTGAC CATGTTCTTC AGGTAGTTCA TCTACTAGTT CAATCAGATT ACTTTCAGTC 2700 

GTATATTCTT TCGTATCTTC AATTGTTGTA TGATCGCTAA CAGCACCAGT TACAATACCT 2760 

TTTGTAGAAT CTTCGTCAAA TTCAACTAGG TTAGACTCAG TAGTAACCTG ACCACCACCT 282 0 

50 GGGTTTGTAT CTTCTTCATA TTCAACAACA TCAGCATGAT GTTTTGAATT TTCATGTGTC 28 80 

GATTCTTCAA AGTCTACA7G AATAGAATCT TCTTCAGTTT CAATGGTACC TTCTGCATGA 2 940 
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TCTTCGATTG TACCAGTCAA TTCATGCTTC TCCACTGGCG GCTCTGATTT AAATTCAAGT 3 0 60 

TCGATAGGAG TACTATGTTC TATAATAGGT TCCTTTAGTT TATCTTTGCC GTCGCCTTGA 3120

GCGTTATTAG AGTAAAATGC AACGCCATTT TTCCaAGTTA AATTACTTGT ATAATAATAG 3180

TTATAATATC CAAAAAGGTG TGTTTGAAAT TCTAAGTTGC TAG C AT IT G A ATCATAATAC 3 24 0 

C r T" CATATT TTATTACATA ATTTTTACTT TGGTCTAAAT TATTAAAGTT TAAAGAATAA 3 3 00 

CCACCATTAG TATCAAAATC TAAACTCATA TTATCAGTCA CATCTTCAAA TTTGCTGACA 3 36 0 

TCATCAAGCT TTGCATAnTn AgctTTCAGC TAAATCGTCT GAACCAATGT GTTTATATAC 34 20 

15 CTTAACTGTT GGATTATTAA CCCCTGGTTT ATTTCCTTTA GTTACTTGAC CAGTTACTGT 34 8 0 

CACAGAGCTT AACGACTGGT TGTTAGGTTT CATGTACGCA AAATGACTAA ATTTCCCATC 3 54 0 

TACTTTATTT AAAGTATCAA TTCGACCATT AGCTGTTACT CCCCAATTAT CTCTAACTCC 3 6 00 

20 ACCTAAATAT TGAATATTAA ATATTTTGCT AACCGTAGTC TCACCCAATT TAACTTCAAC 3 6 60 

ATTTTGGTTA CCTTTTTGCG TCACTGTTGT AGGATCAATA AATAGATTTA AAGATAATTC 3 72 0 

AGCAGTTAAA TCTTTCTTTT CTTGTACATA TTCTTTAAAC GTATATCTAA CTTTTCTTTC 3 78 0 

TCCAATTATT TCTCCTGTCG CCATAACTTG ACCATCTGTA CTTTTTATCT CCGGAACTTT 3 84 0 

ACGCAGTGTT GAGATACCAT GAGTTTCAAC ATTATCGCTT AATGTGAAAT CAAAATAATC 3 900 

TCCCGCCTTA ATTCCTTCTC CAAATTTCCA TTTATATTTC AAGGTTACTC TTTCTGCGTT 3 96 0 

ATGAGGATTT ACAACATTCG TATCTTGTTT ATGTCCTACA ATTTCACTAC CTTCTTCTAC 4020

TTCCACTTTA TTTGTTACAT CTGTACCTGT CGCTTTAGTT TCTTCCACTA CTTCTTTCTC 4080

TGCAACTGCT GTAACGTCAt TGatCTTTTC ATTCTTGGTT TAATTTCTGA GACGTTACTT 4140

GGTTGAGCTA TGTCAACTTG AGTTCCTGTA GTTTCCTTAT CAGCAACTTT TTCCGATGGC 420 0 

AAATCAACTC GCGAAgTTTC TACTTTTGGT GCTTGCAcAG TTTTCGGTGC TTCTTCTGTT 4 2 60 

40 GTTACTTGTG TTGATTGTGA TGGTTGCTCA GTTGATGTCG CGCTGTATGA TTGTGTTTCA 4 3 20 

TCTATTGTAT TAACGTTATT TGTAGTTGTT TGTGTTTCGC TTGCTTTACT TTCAGTAGCT 4 3 80 

GAACTCCCAC TTTCCTCTAC TGTAGTATTG TTTTGTTCCG ATGCTGCAGC TTCTTTTTCT 444 0 

45 TGTCCCATTC CAACAACGAT CATTGTTCCT AAGAATACTG AGGCCGCTCC CAATTTGTGT 4 500 

TTTCTTATGC CGTATCTAAG ATTGCTTTTC ACTATAATAT TCTCCCTTAA ATGCAAAATT 4 5 60 

CATTTATTTT TAAAACTCAA TAAATGCAAT TCTATATTGT TCGGTTTTTA AAAGCAATGA 4 620 

AAAAAAGCGA GTTAATAAAA AGTTAAGATT GTTGTTAACT TTATGTATAA TGAGTTTTTT 4680

ATTATTTGAA ACTCACATAT ATATTGCATA CAAAGCTCTT GAACACCTTG ATATAACAGG 4 74 0 
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TAC7AAACCA TACATAATAA TCGCCTGTAC AATGCATCAT TAACAAGTCA CTGAAACGCC 4 860 

TTTCATTGTA TTAATAACGT CACTATAATT TTTATATCGT TCGGTTTTTG TTTGATTTTA 4 920 

ATGATTATTT ATACAAAAAC AGCCGTATTT CAAGCCGACA TTTTAAATTT AACTAAATTT 4980

GCATCTAGTT AATAATTGCA TTTATCAAAT TTGTCTTATT GATCCAATCT AATTTGTACT 504 0 

CACAAACTAG TTTAAAATTC TAACTTTATC TCTCAGTTCG TTATCAATCA TCAGACATAA 5100 

ACCAATGAAG CAATCAGAAA ACACTCTAAT TTTCTATTAG AAATTTGATT TAATATAAAA 5160 

AAACAGGCTT ACTTCATATA ATTTATGAAA TAAACCCGTC AATTTTTGTT TAATTATGCT 5220 

TTGTGATTCT TTTTATTTCT GCGTAATAAT GCTAAACCTA GAATGCTGAA TAATCCGCCG 5280

AACAACATAC CTTTGTTTGT TGATTCTTCT CCACCTGTTT CAGGTAGTTC AGATTTCTTA 53 4 0 

GATTGTGGTT TTTTAGTTGG TGCCACTGCT TTAACCTTTT CATTGATTTC AATAACAGGT 54 00 

20 GTTACTACTT TACCTTGTTC CACTGGTTTA GAAGGCTTTT TAGGTTCTTC TTTGG CAGGT 54 60 

GGTACTGGTT TACCAGGTTC AGCTGGTACC TCTGGTGTTG GCGGTGTTGG AGTTTCTGGC 5520 

TCACTCGGCA CTTCTGGTGT CGGTGGTGTT GGTGTTTCCG GCTCACTTGG TACTTCTGGT 55 80 

25 GTTGGTGGCG TTGGTGTTTC CGGCTCACTT GGTACTTCTG GTGTCGGTGG CGTTGGTGGC 564 0 

ACGATTGGAG GTGTTGTATC TTCTTCAATC GTTTGTTGAC CTTCATTTTG GCCGCTTACT 5700 

TTTGG AAGTG TATCTTCTTC AAAGTCAACA CTATTGTGTC CACCGAATTG ATAACTTGGT 57 60 

TTATCTTTAT TTGTATCTTC TTCAATAATT TCAGTGTGCT TATTGAATCC GTGAATATGT 5820

GGCACACTGT CGAAGTCGAT ATCAATGATG TTACCGCCAT GTTCATACTT AGGTTTGTCT 58 8 0 

TTTTCTGTAT CTTCCTCGAA TGACTGATTA CCTTTATTTT GACCATGAAT TTGAGGTACA 5 94 0 

CTATCAAAAT CGaTATCTAC GATATTGCCA CCTTGTTCAT ATTTAGGTTT GTCTTCTTCT 6000 

GTGTCTTCCT CGAATGACTG GTTACCGCTA TTTTGGCCAC CTTCATAACC TAATTCACTC 606 0 

40 TTAATATCAA CGTGGCTATT TTCTTCGATT TCTTCAATCA CGTCATAATT CCCGTGACCA 6120 

TTTTCAGTTC CTAAACCAGA ATGAGAAATA TGATGATTGT TTTTAGTAAT TTCCTCGACT 6180 

GGTCCTTGTG CTTGACCATG CTCTT CAGGT AATTCATCCA CTAATTCAAT CAGATTACTT 624 0 

45 tCAGTTGTAT ATTCTTTCGT ATCTTCAACT GTTGTATGAT CGCTCACtGC GCCAGTTACA 6300 

ATACCTTTTG TAGACTCTTC GTCAAATTCA ACTAAGTTAG ACTCAGTAGT AACCTGACCA 63 60 

CCACCTGGGT TTGTATCTTC TTCATATTCA ACAACATCAG CGTGATGTTT TGAATTTTCA 64 20 

TGTGTAGATT CTTCAAAGTC AATTGGATTT GATTCCTCAG AGGACTCAGT GTATCCTCCA 64 80 

ACGTGACCTG ctTCGCTATC CACAGCAGTA TGGTAATCGA TATCAATAGC TGATGAATCC 6 54 0 
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TGGTAATCAA TGTCAAGAGT TGATGAATCA TATTCCTCTT CAACAGTAGT TACTAAATTC 6 6 50 

TTATCATATT GACCTGTAAG AGTTTCTTTA ATTGTATCTT CTTTATATTC AAATTTATTA 6 7 20 

TTTTGAATAA TCGGACCATT TTTC7CATTT CCGTTCGCTT TATTACTGTA TAAAACTAAA 6780

CCATTATCCC AAGTTAAGGT ATATCCTCTA TCATAATAAT ACTTATAAAG TTGCTCTGGA 6840

TGTCCTACCA TTTGTGTTCT AAAATCAACT TCATCAGTAC CA7TTAAATA CTCTCCATCA 6 900 

TAGTGAACAA CATAAGTTTT ATCTAGATTT TCTATATTCA ATGAATAGCT TCCATTATTT 6 96 0 

TGTAAATTCA AATTCCCACT CATATTACTT GTGACTTCTT 7AAATTTAGA AGTATCTGTC 7 020 

GTATTTGCAT ATACACTCTT CGCTATGTCT TCATTATTAC CCAAGTATTC AAATATCCTA 708 0 

ACTTTTGGTT GATTTCCATT CTGATTACTA CCTTTCATTA AAGTTCCAGT AACAGTCACA 714 0 

CTTGTCGTTT TACCATTATT AGGTTTAATA AATGCAACAT GCGAAAATCT ATTATTCGCT 72 00 

20 TTATTAAATG TCTCAATCGA TCCATTTAAA TTGGCATAAT AATTCCCAAT ACCATCTTTA 726 0 

TATTTAACAT CTAATTCCTT TGAAGTTTGT TCTTCATTTA GTGTTGAAGT TATAGTTTGA 73 2 0 

TTTC CATTAG TTTGTACAGT TTTAGGATCA ATAAATAAAT TAATTTCTAG TTCAGCCGTT 73 8 0 

25 ACATCAACCT TATCTTCAAT ATCATTTGTA AATGTATATC TAATCTTTCC ACCTTCTAAA 744 0 

ACTTCACCTG TCGCCATTAC GACTGAACCA TTTTTAATTT CTGGTACTTT TCTAGCAGTT 7500 

GATACGCCAT GCGTATTTAC ATTATTTGAT AAAGTAAAGT CAAAGTAGTC ACCTTGATGT 7560 

AAACCATTCT CAAATTTCAA CTTATATTTT AGTACCGCTC GTTGTCCTGC ATGAGGTTCT 762 0 

ACTTTATTTG TATTGTTATG CCCCTCAATA GAACCAATTT CTACTGTAAC TTTACTTGTT 76 8 0 

ACATCTGTAC CCGTTTCCAC TTTCGCGTTA CTAGCTTCCT TAGCTTCCGC TACATCTGCT 7 74 0 

GATCTTGTCA CACGTGGCTT ACTTTCTGAT GCCGTTCTTG GCTGTGCCAC TTCAACTTGT 78 00 

GTTTCTGCGA CTTGATTTTG TGTAGCCTTT TTAGGTGTTA AATCTACTTG TCTTTGATCT 78 6 0 

40 CCGCTATTGT CTTGAGATTG TGTTGTTTCC TTAACTTGAG GTTTCGCTTC TTCCTTAACT 7 920 

ACCTCTTCTT TAACTGTTTC TATATTTGCT GGTTGTGCAG TTTGTGGTGC TTGTACTGCT 7980

TTTGGTGCTT CTTCAGTTGT TACTTGTGTT GCGTTTGACG GTTGTTCTGT TACTGTTGCG 804 0 

45 TTATATGATT GAGTTTCTTC TATATGATTA ACGTTAGTTG CAGTTGTTTG TGTTTCACTT 8100 

GTTTTATTAT CAGTAGCTGA ATTCCCATTT TCTTCTACTG TAGTTGTCTT TTGTTCTGAT 816 0 

GCTGCAGCTT CTTTGTCTTG TCC CATC CCA ACAACGATCA TTGTTCCTAA GAATACTGAT 82 2 0 

GCTGCTCCCA ATTTATGTTT TCTAATGCCG TACCTAAGAT TGTTTTTCAC TATAATATCT 82 8 0 

CCCTTTAAAT GCAAAATTCA TTAATTTTTT AAACTTAATA AATGCAAGTC TATATTGTTC 8 34 0 
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20 



ATGTTAATTG ATAATTTTAT TATTTGAAAT ATACCTATAA ATTGTATTCA AGTCATCAGA 84 6 0 

AACCCTTGTC ACACAAGGCT TGTATTTTTT ATACTTATTT TTTAAATTAA ATTCATCATT 8 520 

ATCTAATTTA AAACAATATA CTAAACGTTT CATAATTATC GCCTGTACAA TACGCACAAA 8 580 

AACATGTCTT GAAACGCCTT TCATTACTCT AAAATACCCA ATATACTTTT TATATCGTTC 8 64 0 

GGATTCTGAG TATTTCAGAC GATTTTCTGC ATAAAAATAA ACGTGTTTCA AGGCAATATA 8700 

TTGCAATTAC CTAAAAACAC GTTTACTTAA TATTTAGTTA AACAAATAAG CTAATGAATA 876 0 

AAATGAAGAT GATACCTGAA ACGGAAATAA TCGTTTCTAA TAATGACCAT GTTAAGAATG 88 2 0 

TTTCTTTTAC AGTTAAACCA AAATATTCTT TAAACATCCA AAATCCTGCG TCATTTACAT 8880 

GAGACAAAAT CACACTACCT GCACCTATCG CAAGTACAAC TAATG CAACA TTTACATCTG 8 94 0 

ATGATTGTAA TAATGGTAAG ACAATACCTG TAGTTGAAAT CGCAGCTACT GTAGCCGAAC 9000 

CTAATGCGAT ACGTAGCACA GCTGCAACAA TCCATGCTAG TAAAATCGGA GACATCTCTG 9060 

TACCTTCAAA CATTTTAGCA ATTGTATTTC CGACACCGCC GTCAATTAAT ACTTGTTTAA 9120 

ATGTACCGCC ACCGCCAATA ATCAATAACA TCATTCCGAT TGGATAAATC GCATTCGTCA 9180 

CTGATTCCAT AATATGATTC ATCTTACGCT TTCTCATTAA TCCCATCGTA ACGATTGCAA 92 4 0 

ATAATACTGC TATTAGCATG GCTGTCCCTG CTGTTCCTAT CATATAAATG ATAGATTCAA 93 00 

AT AGATTTG T AGGTTTGTCA TGCCCAGTTA CAAGTTGCGT TATCGTAGAC ACTAACATTA 9360 

ATATGACTGG TAATGTTGCT GTTAATAAAC TCATACCAAA TCCTGGCATC TCTTGATCCG 94 2 0 

TAAATTCTTT TTGTGCACCT AACGCTGAAA TATCGCCTTC TCGTGTATAC GCAGACGGAA 94 8 0 

TCATTTTTTG TGCAcTTTGT TAAATATAGG CCCTGCAATG AGTGTAACTG GaATGGCAAT 9 54 0 

AATCATACCA TACAGTAATA CATCTCCAAC ATTTGCCTTT AATTCTTTTG CGATGACTAC 9600 

CGGTCCTGGA TGTGGTGGTA AAAAGCCATG TGTCACTGAT AAAGCTGTTA CCATAGGTAG 966 0 

TCCTAGTTTT AACACTGAAA CATTTGCGCG TTTTGCTACT GTAAATACTA ATGGAATCAG 972 0 

TAAGACTAAA CCTACTTCAA AGAACAATGC AATACCGACG ATAAATGCTG CAACAAGCAT 97 80 

TGCCCATTGT ACATGTTTTT GACCAAATTT TTGAATCAAC GTGTCTGCGA TTCGAGTTGC 984 0 

ACCACCACCA TCAGCAAGCA ATTTCCCAAG TATGGCACCT AAACCGAATA TCAGTGCAAT 9900 

GTGGCCGAGC GTACTGCCCA TTCCTTTCTC AATCGTCTCC ATAATTTTAG TCAATGGTAT 9 960 

ACCTAGCATT AACGCTGTAA TCATCGATGT GATAATTAAT GAAATAAATG TATTTAATTT 10020 

AAACCCAATA ATTAATACTA ATAAAATAAC GATACCTAAA ACAACACTGA TTAACGGCCA 10080 

TATTTCGTTA AACATGACAT TCCCCTCTTT CTCTTTTCAA TAGAATGTAA CACCGTCGTC 1014 0 
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GAGTGACGTA 
TTTATTGTGT TTTATTTTCA GCGATATGTT GGCGTTGAAA ATCTGCAATT 10260

TGTTCATAAT TCTCTGTTAA AGAACGACTT AAATTGATAA AAATGGATAC GATCTCTTGG 10320

TAAACAGTGA CATTTTCTTC AATCGGCGTA TGATTGTTTG TGGCACCGAC CATCGATGAA 10380

ACGATTGAAA AATCTTCAAT GTCACCTACA GCTTTAAGTC CGAGCACGCA GGCACCTAAG 10440

CATGAACTTT CATAACTTTC AGGAACCACT AACTCTGTGT CAAATATATC TGACATCATT 10500

TGACGCCATA CTTCACTTTT CGCAAAACCA CCTGTTGCTT TTATCATCTT AGGTGTTTCA 10560

TTCATTACTT CAATAAGCGC AAGATAGACG GTATACAAAT TGTAAAGAAC ACCTTCTAAT 10620

GCAGCGCGAA TCATATGTTC TTTTTTATGA GATAAAGTTA AACCGAAGAA TGAACCTCTT 10680

GCATTTGCGT TCCAAAGCGG CGCACGTTCT CCTGCTAAAT AGGGATGGAA TATTAAACCA 10740

TCTGCACCTG GTTTAAGACG CTTTGCAATT TGAGTTAAGA CATCATAAGG ATCAACACCG 10800

AGACGTTTCG CAGTTTCGAC TTCACTCGCT AGCAACTCGT CGCGCAACCA TCTCAATACG 10860

ACACCACCAT TATTTACAGG ACCTCCGATG ACGTAGTGGT CCTCTGTTAA GACATAACAA 10920

AATATTCTAC CTTTGTAATC AGTACGCGGT TTATCTATCA CAGTACGAAT CGCCCCAGAT 10980

GTACCGATTG TGACAGCAAC TTCTCCTTTA CCAACACTAT TGACACCTAA ATTAGAAAGG 11040

ACCCCATCAC TCGCACCAAT AACAAACGGT GTATCTTTAT TAAGCCCCAT TAATGTTGCA 11100

TAACGTTCTT TCATACCTTT CAtCACATAC GTTGTTGGAA CTAATTCCGG CAACATTTCC 11160

TTGGAAATAC CCAGCAGTTC TAATGCCTCA ACATCCCAAT CTAATGTTTC TAAATTAAAC 11220

ATCCCTGTTG CGGAAGCCAT TGAATAATCA ATGATATATG TATCAAATAA ATGATAGAAA 11280

ATGTATGTTT TAATATCTGC AAACTTAGCA GTACGTTGAA ATACATCTTG CCATTCATGT 11340

ttcatccaaa AAATCTTCGC TAATGGCGAC ATAGGATGAA TCGGTGTGCC TGTTCGCTGG 11400

taaatcgcat TGCCATCATG CACTTCATTT ATTACTGTTG CATATTTTGC AGCGCGGTTA 11460

tctgcccaag TAATATTATT TGTTAATCTT TGATGTTGCT GATCCATCGC AATCAAGCTA 11520

tgcatttgcg CACTAAATGA CACAAACTTA ATGTCGTCTT TATTAACTTT GGATTCTCTC 11580

ATAACATATT TAATAGTCAT TAGTACTGCA TCAAATAATT CATCTGGGTT TTCTTCTGAG 11640

ACATCAACGT TTGGTGTGTG taaatcatag CCTATTTGAT GTTTCATGAT AAAAGTTCCA 11700

TTTTCATCAT ATAAGACTGA CTTGGTACTC GTCGTTCCAA TGTCGACACC AATCATATAT 11760

TTCATGATAA ATCCTTCTTT CTTTCATTTT AATTCAACCA AAATCCTTCA ATATCTTTAC 11820

CAACATCGTC GAAATTTAAA TGAAACGCTT CTTTCAAAAT TTGACTGTCG TATTGTTCCA 11880

CTGCATCAAT AAACACTTGA TGATTATGAT GTATGCGTTC AAAATCTTGC GGGTTCTGTT 11940
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AAAATGAGTT TAAATATTGA TGATTAGATG CTTTGATTAA TGTTTCATGA AATTCAAAGT 12 060 

CATGCTTCGT AAATGATTCT GCATCCTCAA ATTTTACTGC CACTTTCATC ATTTCAAGTT 12120 

GTTTCTTCAT TTCTTTTACG ATAGGTAGTC GCTCTTGATT TTTAACTCTT GAAAATGCAA 12180 

ATGACTCTAA CATCAGTCGC AAATCATACA TTTCTTTCTT TTCTTGTTCC CCAAACGGCA 12240 

ACACATGTGC ACCCATTCTT TCTAATTGGA TGAGTTGATT TTGTTGCAAT AATTTAAATG 123 00 

CATCTCGAAT TGGCGAACGA CTCACATTAA ATTGCTTTGC CATTTGATTT TCAGTGAGTA 12360 

ACGTACCTTC AGCTATGTGA CCATTCACAA TGCCTAAGCG TAATTCTGCC GCGATACCTT 12420 

CTCCAGTTGT CATACCTTCC AACCATTTCT CTGGATATCC ATACATCATC AAAGTCACTC 12480

CTTCATTACA CGAC ATACTT GTATACAAGT ATGTTAATAT AGTTATTATG AGTTTGCAAG 12540 

CGCTTTCTTT ACGAGCACTA AAATAGTGAC CACCCCTTTT CGATTTAAAT TTAAAGGAAA 12 600 

20 TGGTCACTAT CACACGAATG ATTTAATTGT TATGTTGTAT GTGGGATATT TCTAATTGTT 12 6 60 

CTGTACTCAT ATGCGCTTTA GGTACTTCAA TGCAATAATG CGTTTCATGA CAGTTTGGAC 12 720 

ATTCGAATCG ACGTGTTGTC GCTGTATGTT TCGCTTTGAT AACTGCCCAC AAAGATGGTG 12780 

25 AGAATATATG CTGGCAGTTA GGACATAAAT AGGCAACCTT TTGTTGGTAA TAAAAAGTAA 12 840 

CACCAATGCC ATAACCAATC ATAAATGGTA AAGCAATTAA AAA CGGCC AT TTATTTTTCA 12900 

TCAAAATTGC ACTTATAATG CTAGAATATT GAATTATTCC TATAATACCA GCACTAATCC 12 960 

AAATGTTACG ACGAATACTT TTCATTTCAG CTGATTTACT CATGACATGC TCTATGTCTT 13 020 

TTAAGTGTGT GATTGGAGAC GTCGACGCTT CATTTACGTA ATATTGAACA TTTTTAATTT 13080 

TGTTTAATAC CGCTTGTTGC TGTTTAACTT GTTGGTTAAT TTCTTGTTGT TTCATAGTTA 13140 

GTAAAGTATT GAGCGTCTTC AAAGTACCTT CACCTTTTAG CAACATATCT ATATCGCTTA 13200 

ACGC&CAACC TAAATCTTTA AGCAATAAGA TTAACTCTAA TGTTTGTCGC TGTTGTTCTG 13260 

40 TATACACACG ACGCTTTCCT TCTGTAAATC CTTGTGGTTT CAAAATACCT TTGCGATCAT 13320 

AATATTGAAT CGTTCGTGTT GTCACATTGC AT AATTTTG C GAGTTCTCCA GTCGAATAGT 13 3 80 

TAGACATAGA TTCCACCTCC TATAATTACC ATAGTTGATG ACCCGACGTC ACGAGCAAGT 13440 

45 ACAATTTCCA CATTTTAAAG AAATTTATTA TACTAGGCGT CTTATTTTTA TGATTTCGTA 13 500 

CCATGTTGAT TTACAAACTC ACTCAAACTA AGTAACACAC CTACTAAACA TCTACTCTGT 13 560 

TATTTCAGAA TGAATTTGTT GTAATTTATC TTCAACTTCA GTAATCTCTG TCGCACATTC 13 620 

TTTCAGTAAA TCTCGATACT TTTCCGTCTC TGCATTGTTT TTATAACGTA TTTTATGTTC 1368 0 

TAAACTTGcC CACATATCCA TACCTATCGT TCTAATTTGA ATTTCAACAG GCAATACCTC 1374 0 
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30 



(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1059 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

GGATAAGTTC AGGTAAATTC ATTTCTTTTT CAATTTTGAT TTTCATTGTT TCCGCCCTTT 60

TAAAATAAAG TTAGTTGCTT CTGTTCCTCA TATTCCAAAT CACTTTGCTT TATATATGTT 120

TCAAGCTCTT CCGCTGTATC AAATGTCTTT TTCACACCTT GCCAACCTGG CACGATATGA 180

CCGTGAAAGT AATAAGTGCC ATTTACTACA TGGATATGTG CCACTCGTTC GTTATCCTGA 240

TACAGATATC TCTTAGATCC AAAGAATTGA TTTAGGTATT CTTTACGCGC GCTATCTGTC 300

ATGGTCATCA CTCCTTTTAA CAATTAGGCA GACCAAACGA CATGCATTCG TCGTATAGCT 360

CTTCATTACT TATGCTTGCC TTATAGTTTT CAATCACATT GCTAACTTCT TTATGACTCA 420

TTGCTTTAAC TTGTTCGTCT GTATATTTTT CGCAGTCTTC TAATTCCAGT TGCTCCTGTA 480

ATGACATCAC ATATTCAACT TGTCTTTGGG TTGCCATCGT TAACCCTCCC ACAAGTCAAA 540

AGCTCTTTGG ACGTAAAACT TCGCCTTTGC TAAATCCTCA TGACCATTCT TTAACGGTGC 600

TCTAGACATG TATTTGATTG CATTACCTAT TGCGAATGCT AGTTGAGGTG GATACTGTGC 660

CGTAACCTGT TCGATAAAAT CTATAATTTC AATGTCGCCG TATGTGTAGT GCGCTGGTTG 720

CTTAACATTG TCTTGCGCTT CGTTCATATC TACTTTTCTG TTACTGATTA CGCTCATTAT 780

GCTTCACTCC ATTTCTTGAA CATTTGGTTA TAAGTGACAT CGAACCAGTA CGGATCACGT 840

GAAT5TTTTT GTGGCGTTCC ATCATAAAGC CATGGTCTTA ATCTTCTCTT TCTTTCCTGT 900

TCATATTCCG CTCTCACATT TCGTTGGTAT CGGTTCAAAA TCGCTTTTTT TCTGATTTTT 960

TCTCTCCCTT TTTCTTCATC TTTnATtTGA CTCTnCATAT ATTCAACTTC TTCTGTAGAT 1020

nTTGAGTCCT TTCTTCCACA CAATAATTCA nCGCCGCGC 1059
(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30246 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
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GAAGTAAAAG AAGAATTAAA TTTAACATTA ACAATGGATG AAATTGAATA TGTCGGGACA 6 0 

ATTGTAGGTC CTGCATATCC ACAACAGGA7 ATGTTAACTG AGTTAAATGG ATTTCGCGCA 12 0 

TTAACCAAAA TCGATTGGGA AAACGTAACT ATCAATAATG AAATTACGGA TATACGCTGG 180 

ATTGATAAAG ATAATGATGC GTTGATTGCG CCTGCTGTCA AAGTTTGGAT TGAAACTTA7 24 0 

GGTGGTAAAC ATGACAAATA ATGACACCAT CATGTTACGA CATTATGTCC CACAAGAT7A 3 00 

TTCGATGTTA GAAGCTTTTC AATTAAGTGA AAGTGATTTG AAGTTTGTTA AAACGCCAGA 3 60 

GGAAAATATT ACAGCTGCAA TGTCTGATAA TGAAAGGTAT CCCATCGTTG TAATGGATGG 4 20 

CAGGCAATGT GTGGCCTTTT TTACATTACA TCGTGGAAAA GGGGTCGCAC CATTTAGCGA 480

TAACCAAGAT GCAGTATTTT TCAGGTCATT TAGTGTTGAT CAACGTTATC GTAATAGAGG 54 0 

AATAGGTAAA GTGGTAATGG AAAAATTGGC GTCATTTATC ACTTCAACAT TT CAGG AT AT 6 00 

20 TAATGAGATT GTGTTAACGG TTAATACTGA CAATCCACAT GCCATGGCAC TTTATCGCCA 66 0 

ACAAGGATAT CAATATATGG GAGATAGTAT GTTCGTCGGA AGACCTGTTC ATATTATGGC 720 

GTTAACTATA AAATAAATTA AATTTAAAAG CATCTTTACT CATCGTCGAC CACAACAATT 780 

25 AATGATGAAT AAAGGTGCTT TTTGTTATAG ATCATCGGAC AATTTACTAT AGTAAAAAGC 84 0 

GACCTAGTGA A CAATTG A CA TATATCCACA GGTCGCTTAA CTTAAGTTAT ATTGCTAGTT 900 

GCGATTAATT GAT AG A CT CA TCATTTTTGC GCTGTCGAGA TGGTCTTTTT ATTAAAAATG 96 0 

CCGTAATCCA AGCCGTAATC GGAATACTGA TTGCAACGGC AATACCGCCT AAAATAATAG 1020 

AAATAAATTC TTGGGCAAAT ATTTTCGAGT TTATAATATG ACCAAATGAA TATTTAAGTT 10 80 

TGAAAAACCA AATAAATAAA GCAAGTTGGC CACCAAAAAA GGCAAGGTAA ATCGTGTTCG 114 0 

CAGATGTCGC TAAAATTTCT CTACCAACAC GCATGCCAGA TTGGAATAAT TCGTATTGCG 1200 

TAACGTTgGA TTCACTTGAT GCAATTCATA AATGGGTGAA CTAATGGTAA TTGTTAAATC 1260 

TATCACAGCT GCAATAACAG CAAGAATAAT AGTGAACACC ATAAATTGAA CCATATCAAT 1320

GCCAATATTC ATTGAATACA CATATGTTTC ATCTTGTTGT TCGGTTGaAA AGCCTTGTAG 13 80 

ATGACCGAAG TAGACCGATA AATAAATGAG TGTAATCAAC AATATTGTTG TAACGATAgT 144 0 

45 GCtGgATAAA TGCaGCTTGT GTTTTAACAT TGTAACTATT GAGTACGAAT AAATTACAAG 1500 

CGCCAATAAT AATGCAGAAA AAGAATGTGA CGACATAAAT CGGTACGCCA AAAATAATCA 156 0 

ATACAATACT AATAATTAAA ATAGCGAAAT TTAAAAATAG GGTTAAATAA GAGATGAATC 1620 

50 CCTTTTTACC TCCGAAAATT ATCATCAGAA AGAGGAGCAA TAACGCCAAT ATAAATACAG 16 80 

CATTCATTGT TTCGCCCTCC TTAATGTTTC AAATATTTCC ATAAACAATA TTGTGATAGG 174 0 
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CATCGAAATA GTATAAGTCA CTGTATTGGC ATTTTTTAAA AAGATTAAAA ACATAGGTAG 186 0 

TGCACC3GAT AAATATGAGA ATAATAAGAT GTTAGTCATT GTTCCCATAA TATCTTGGCC 192 0 

GATGTTTCGC CCAGCAAGCG CCCATCTCCT CATTGAAATG TGTGGCGTAC GCTGTAAAAT 19B0 

TTCATGCATA CCACTAGCAA TTGTAATTGC AACATCCATA ATAGCGCCAA GTGAACCTAT 204 0 

TAACACTGAG GCTAGGAAGA TATCTTTCGG TGGTAATGAT AAAAAGTTCA TCGTTTCATA 2130 

TTTAATGCCT TTACCATCTG TCATATATAT GATTAATTCT GTTAAACCTA TACTCAAAAA 2160 

AGTTCCGATA ATTGTACTGG CTATGGTAAT GAGTGTACGC ATATGCCAGC CTGTAACGAG 2220

CAATAAAGTG AGTATTGTTG AACAGATCAT GGCAATGGTC ATGAGTAAGA ATAAATTAAT 2280

ATTGCTATGT TGAATATGAA TGTAAATTGC GATTAATATG GCAATAGAAT TCAAGATTAA 2340

CGATAAAATC GATTGCAGTC CGACTTTGCG ACCAACCAAT AATACAGTTA ATAAGAACAA 2400

ACCAGTGATG ATAACCGTTA AGGTATCACG CTTCTTTTCT ATAATATAAG CATCACTCGG 2460

CTTGTTAGAA ATATGTAATA ATACTTTTTC GTGTGTGCGA AATGCCTCAG AATCTGCTTG 2520

CGATTTGACG TACTGATGAT TAATCGTCGT CGTTTCTCCA GCAAATTGAC CATTTAATAT 2580

TTTGACTTTT AATTGATTTT TATATTTAAT ATCACGATTA TTTTGTGCAT CTTTTGTAGG 2640

TGTCGAAGAA ACATGTTTGA CATCTATAAT TTGACCAATT GGTTTGTTGT AAAAGTTCTC 2700

ATTATTGAAT GTAAATAAAA TAGCACCAAT GAATGCGATG CAGAACAAAC CTAAAATTAT 2760

ATTAAATGGC TTTGTAAATA AATTTCTATA TTTCAAAAAC AAAACCCCAA TTCTATGAAT 2820

GAATTAATAT GGTGATTATA CGCCCTTAAT TTTTTATTTT CAAAGATATT ACTGCTAAGT 2880

GTAAAACGAA AATCATCATT GATAGCATCG AATTACTTAA TGGAATGTAG ACGTTTTAGT 2940

CATTAATTGC TGAATAAGTG TTAATAATAT GCCAATATCA CTCTTTGTAT AAGGCTCCTT 3000

TGTAATAGCA CATATCGTTC TTTTTAATTC AGTATGATCT AATTTTATAT CTATCCATGA 3060

TTTAGATTCT GGTAAATGTA TATTTTGTGA TGAAATGATG TAACCTTCTT TTTGACGAAG 3120

GAGATACTGC GCAAGTGGTT GGCTACTGAT TGTGTATACA TCTGATTTAG TAATCTTGCG 3180

CAATTGTTTT TTTACAGTTT CGGCAAATGG TGCCAAGCAA TAAATATGAC TATGCTCAAA 3240

CTGAATTAAT GGTGGGTGTG TCGCCATCGT AATTGGATCG TCTGAAGGCG CATATAAATG 3300

ATAGTGCTCT TCGAATAAAG GTAGCATATG TAATTGTTTG TGTTTACGTA TTTCTGGTGT 3360

AAGTTCCGTG AAACCAATGT CTATATTCCC ATTTAATACG CTATTTATAA TTGTGTCATG 3420

TTCTAATAAG CTCGGTATGA CATGTGTATC ATTTTGTAAA TGAAACGTTT GGATAAGTGG 3480

TAGTAACATG TGGGATACGT CACTCTCATC ATAGCCAATG TAGATACTTT TATTTTTAGT 3540
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TTCATTAAAT AATAATTTCC CTTCAGATGT GAGCGTAATA TTGCGTCCTT GCTTTTTAAA 3660

TAAAGACACA TTAAGTTCTT GTTCTAATAA TGTAATTTGA CGGCTTATCG CTGATTGAGC 3720

AATGTTTAGT TCAAGTGCTG TTTCGGAGAT ATGTTCTCTT TTAGCGACCT CGATAAAATA 3780

TCTTAATTGT TTAATTTCCA TAGCGATATA GGCACCTCCA AAAATGAGTG TTTTGTAACT 3840

ATTATAGCAA TATTATTGAT AAATGTTCTA TTTTTTAGAT GAATATCTTC TATTTTATAT 3900

ATTGAACAGA TAAATTTTTT AGATTATAGT AATTATCATT AATAACTAAT ATCAGAATAT 3960

TCTAAAAAAG GGGTGTGCAT CATGCACAAT GAGAAATTAA TTAAAGGCTT ATATGACTAT 4020

CGTGAGGAAC ATGATGCGTG TGGTATTGGT TTTTATGCGA ATATGGATAA TAAAAGGTCT 4080

CACGACATCA TTGATAAATC GCTTGAAATG TTGCGACGCT TAGATCACAG GGGCGGGGTC 4140

GGCGCAGATG GCATCACTGG TGATGGCGCA GGTATTATGA CTGAAATACC TTTTGCATTT 4200

TTCAAACAAC ATGTAACGGA CTTTGATATC CCAGGTGAAG GTGAATATGC CGTGGGGTTA 4260

TTTTTTTCCA AAGAACGCAT TTTAGGTTCT GAACATGAAG TAGTTTTTAA AAAATATTTT 4320

GAAGGCGAAG GGTTATCAAT TCTTGGTTAT CGTAATGTAC CAGTTAATAA AGATGCCATT 4380

GCTAAACATG TAGCAGATAC GATGCCAGTC ATTCAACAAG TGTTTATTGA TATTAGGGAC 4440

ATTGAAGATG TTGAAAAGCG TTTGTTTTTA GCGAGAAAAC AATTAGAGTT CTATTCGACT 4500

CAGTGCGATT TAGAATTGTA TTTTACGAGC TTATCACGCA AAACAATTGT ATATAAAGGT 4560

TGGTTACGAT CAGACCAAAT TAAAAAACTA TATACAGATT TATCGGATGA TTTATATCAA 4620

TCAAAGCTAG GGTTAGTGCA TTCGAGATTT AGTACGAATA CATTCCCGAG TTGGAAAAGG 4680

GCACATCCTA ACCGTATGTT AATGCATAAT GGTGAGATTA ACACGATTAA AGGTAATGTA 4740

AACTGGATGC GAGCACGCCA ACATAAATTA ATCGAAACAT TATTTGGCGA GGATCAACAT 4800

AAAGTGTTTC AAATTGTCGA TGAGGATGGT AGTGACTCTG CCATTGTAGA TAATGCGCTA 4860

GAGTTCTTAT CGTTAGCCAT GGAGCCAGAA AAGGCAGCGA TGTTACTCAT ACCTGAACCT 4920

TGGTTATATA ATGAAGCGAA TGATGCAAAT GTACGTGCGT TTTATGAATT TTATAGTTAT 4980

TTAATGGAAC CGTGGGATGG TCCTACAATG ATTTCGTTCT GTAACGGTGA CAAACTTGGC 5040

GCGCTTACAG ATAGAAATGG ATTACGTCCA GGTCGTTATA CGATTACTAA AGATAACTTT 5100

ATTGTCTTTT CATCTGAAGT GGGTGTTGTG GACGTACCTG AAAGTAATGT TGCTTTTAAA 5160

GGTCAATTGA ATCCTGGAAA GTTATTGCTT GTTGATTTTA AACAGAATAA AGTCATTGAA 5220

AATAATGATT TAAAAGGTGC GATTGCTGGA GAATTACCAT ATAAAGCGTG GATTGATAAC 5280

CATAAAGTTG ACTTTGATTT TGAAAATATA CAATATCAAG ATTCGCAATG GAAAGATGAG 5340
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CAGGAACTTG TAGAAGGTAA GAAGGATCCT ATCGGTGCAA TGGGATATGA TGCGCCAATT 5460

GCAGTGTTGA ACGAGCGACC AGAATCACTA TTTAATTACT TTAAACAGCT GTTTGCACAA 5520

GTTACGAATC CACCAATTGA TGCGTATCGT GAAAAAATCG TAACGAGTGA ACTTTCTTAT 5580

TTAGGTGGCG AAGGTAACTT ACTAGCACCT GACGAAACGG TTTTAGATCG TATTCAATTG 5640

AAAAGGCCGG TATTGAATGA ATCACACTTA GCAGCGATTG ATCAGGAACA TTTTAAATTA 5700

ACTTATTTAT CAACGGTATA TGAAGGGGAT TTGGAAGATG CGTTAGAAGC ATTAGGCCGA 5760

GAAGCAGTGA ATGCTGTAAA GCAAGGCGCT CAAATTCTAG TGTTAGATGA TAGTGGATTA 5820

GTTGATAGCA ATGGCTTTGC AATGCCGATG TTACTCGCAA TAAGTCATGT GCATCAATTA 5880

CTTATTAAAG CAGATTTACG TATGTCTACA AGTTTAGTCG CTAAATCTGG TGAGACACGA 5940

GAAGTGCATC ATGTTGCTTG TTTACTCGCA TATGGCGCGA ATGCAATTGT GCCATACCTA 6000

GCGCAACGTA CAGTTGAACA ACTGACATTG ACAGAAGGGT TACAAGGCAC CGTTGTCGAT 6060

AATGTTAAGA CATATACGGA TGTATTGTCA GAAGGTGTCA TTAAAGTAAT GGCTAAGATG 6120

GGAATTTCGA CAGTGCAAAG TTATCAAGGG GCACAAATAT TTGAAGCGAT TGGCTTGTCT 6180

CATGATGTGA TTGATCGTTA TTTTACTGGG ACACAGTCTA AGTTATCTGG TATTTCGATT 6240

GATCAAATTG ATGCTGAAAA TAAAGCACGT CAACAAAGTG ATGATAATTA TCTTGCATCA 6300

GGTAGTACAT TCCAATGGAG ACAACAAGGT CAACATCATG CTTTTAATCC GGAATCTATT 6360

TTCTTATTGC AGCACGCATG TAAAGAAAAT GACTATGCGC AATTTAAAGC ATACTCTGAA 6420

GCGGTGAACA AAAATAGAAC AGATCACATT AGACATTTAC TTGAATTTAA AGCATGTACA 6480

CCGATTGACA TCGACCAAGT TGAACCGGTA AGTGACATTG TCAAACGCTT TAATACAGGG 6540

GCGATGAGTT ATGGATCGAT TTCAGCGGAA GCACATGAAA CGTTAGCACA AGCCATGAAC 6600

CAATTAGGTG GAAAGAGTAA TAGTGGTGAA GGTGGCGAAG ATGCAAAACG TTATGAAGTA 6660

CAAGTTGATG GAAGCAACAA AGTAAGTGCG ATTAAACAAG TTGCTTCTGG GCGTTTTGGT 6720

GTAACTAGTG ATTATTTACA ACATGCCAAA GAAATTCAAA TTAAAGTTGC GCAAGGTGCA 6780

AAGCCTGGTG AAGGTGGTCA ATTACCTGGT ACTAAGGTAT ATCCGTGGAT TGCGAAGACA 6840

AGAGGGTCAA CGCCAGGTAT CGGTCTGATT TCACCACCGC CACATCATGA TATTTATTCA 6900

ATAGAAGATT TAGCGCAACT GATACATGAT TTGAAAAATG CGAATAAAGA TGCAGATATC 6960

GCGGTAAAAT TAGTTTCGAA AACAGGTGTT GGTACCATTG CATCTGGGGT GGCAAAAGCA 7020

TTTGCAGATA AAATTGTCAT CAGTGGTTAC GATGGTGGTA CAGGGGCTTC ACCCAAAACG 7080

AGTATTCAGC ATGCCGGTGT TCCTTGGGAG ATTGGTTTAG CAGAAACACA TCAAACATTA 7140
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

AAAGATGTAG CGTACGCATG TGCGCTTGGA GCGGAAGAAT TTGGATTTGC AACTGCACCA 7260

TTAGTGGTGT TGGGCTGTAT TATGATGCGT GTATGCCATA AAGATACATG TCCAGTAGGA 7320

GTTGCAACTC AAAACAAAGA TTTACGTGCT TTATATAGAG GTAAAGCACA TCATGTTGTT 7380

AATTTTATGC ATTTTATTGC ACAAGAATTA AGAGAAATTT TAGCATCTTT AGGTTTGAAA 7440

CGTGTAGAAG ACTTAGTTGG AAGAACTGAT TTATTACAAC GATCATCAAC ATTAAAAGCG 7500

AATAGCAAAG CGGCTAGTAT TGATGTTGAA AAACTGTTAT GTCCTTTCGA TGGGCCAAAC 7560

ACAAAAGAAA TTCAACAAAA TCATAATCTT GAGCATGGAT TTGATTTAAC AAATTTATAT 7620

GAAGTAACGA AGCCATATAT TGCTGAAGGG CGTCGCTATA CAGGTAGCTT TACAGTAAAT 7680

AATGAACAAC GTGATGTAGG GGTTATTACA GGTAGTGAGA TTTCGAAACA ATATGGAGAA 7740

GCAGGACTTC CTGAAAATAC AATTAATGTT TATACGAATG GTCATGCTGG TCAAAGTCTT 7800

GCAGCATATG CACCGAAAGG CTTAATGATT CATCATACTG GAGATGCGAA TGACTATGTT 7860

GGTAAAGGAT TATCTGGTGG TACGGTCATT GTCAAAGCAC CTTTTGAAGA ACGACAAAAT 7920

GAAATTATTG CTGGTAACGT CTCATTCTAT GGTGCGACAA GTGGTAAGGC ATTTATTAAC 7980

GGTAGTGCAG GAGAAAGATT CTGTATTAGA AATAGTGGTG TAGATGTTGT CGTTGAAGGT 8040

ATCGGCGACC ATGGATTAGA GTATATGACT GGTGGACATG TCATTAATTT AGGTGATGTA 8100

GGTAAGAACT TCGGTCAAGG TATGAGTGGT GGTATTGCTT ACGTTATCCC GTCTGATGTA 8160

GAAGCTTTTG TTGAAAATAA TCAACTAGAT ACGCTTTCGT TTACAAAGAT TAAACACCAA 8220

GAAGAAAAAG CATTCATTAA GCAAATGCTG GAAGAACATG TGTCACACAC GAATAGTACG 8280

AGAGCGATTC ATGTGTTAAA ACATTTTGAT CGCATTGAAG ATGTCGTCGT TAAAGTTATT 8340

CCTAAAGATT ATCAATTAAT GATGCAAAAA ATTCATTTGC ACAAATCATT ACATGACAAT 8400

GAAGATGAAG CGATGTTAGC TGCATTTTAC GATGACAGTA AAACAATCGA TGCTAAACAT 8460

AAACCAGCCG TTGTGTATTA AGGAAAGGGG GAGATACGAT GGGTGAATTT AAAGGATTTA 8520

TGAAGTATGA CAAACAGTAC TTAGGTGAAT TATCACTGGT AGACCGTTTG AAGCATCATA 8580

AAGCATATCA ACAACGATTT ACTAAAGAAG ATGCCTCTAT CCAAGGTGCA CGATGTATGG 8640

ATTGTGGAAC GCCGTTTTGT CAAACCGGAC AACAGTATGG TAGGGAAACA ATAGGTTGTC 8700

CAATTGGAAA CTACATTCCT GAATGGAACG ACTTAGTGTA TCATCAAGAT TTTAAAACTG 8760

CTTATGAACG CTTAAGCGAA ACAAATAACT TTCCTGACTT TACAGGGCGT GTATGTCCTG 8820

CACCATGCGA AAGTGCTTGT GTGATGAAGA TTAATAGAGA ATCGATTGCG ATTAAAGGTA 8880

TTGAACGCAC AATTATTGAT GAAGCTTTTG AAAATGGTTG GGTAGCGCCG AAAGTTCCGA 8940
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CTGAAGAACT TAATCTACTA GGATATCAAG TAACTATTTA TGAACGTGCT AGAGAATCAG 9060

GCGGTTTATT AATGTATGGT ATTCCGAATA TGAAACTTGA TAAAGATGTG GTTCGACGTC 9120

GTATTAAGTT AATGGAAGAA GCGGGCATTA CTTTCATTAA TGGTGTTGAA GTCGGTGTTG 9180

ATATTGATAA AGCAACGTTA GAATCTGAGT ATGATGCCAT TATATTATGT ACTGGTGCAC 9240

AAAAAGGTAG AGATTTACCT TTAGAAGGAC GCATGGGTGA TGGTATACAT TTCGCTATGG 9300

ATTATTTAAC TGAACAAACG CAGTTGTTAA ATGGAGAAAT TGATGATATA ACAATAACTG 9360

CAAAAGATAA GAATGTCATT ATCATTGGTG CTGGTGATAC AGGGGCAGAC TGTGTAGCGA 9420

CAGCATTAAG AGAAAATTGT AAATCGATTG TTCAATTTAA TAAATATACG AAATTGCCAG 9480

AAGCAATTAC ATTTACAGAA AATGCATCAT GGCCTTTAGC AATGCCGGTG TTTAAAATGG 9540

ACTATGCGCA CCAAGAGTAC GAAGCTAAGT TTGGTAAGGA ACCACGTGCA TATGGTGTTC 9600

AAACAATGCG TTACGATGTT GACGATAAAG GACACATACG TGGTTTGTAT ACTCAAATTT 9660

TAGAGCAAGG CGAAAATGGT ATGGTCATGA AAGAAGGACC TGAAAGATTT TGGCCTGCTG 9720

ACCTTGTATT ATTATCAATC GGCTTCGAAG GTACAGAACC AACAGTACCG AATGCTTTTA 9780

ACATTAAAAC GGATAGAAAT CGAATCGTGG CGGATGATAC AAACTATCAA ACTAATAATG 9840

AAAAGGTATT TGCTGCTGGA GATGCTAGAC GTGGTCAAAG TTTAGTTGTA TGGGCAATTA 9900

AAGAAGGTAG AGGCGTAGCG AAAGCAGTAG ATCAGTATTT AGCTAGTAAA GTTTGTGTAT 9960

AATCTTTGTA TGGAAATGGT GGTTACGTTG ACGTTGTGAC ATGCTGAATC GAGTTTGAAA 10020

AAATCTAGTA TCTATCAACG TCACATGCCA TCTTTGTAAC CTAAAAACAA AGGTTTGTAA 10080

GACAACAAAT AGATTAATTA TAAGTAGTGA TTTTTTACAT TCGTTTATAG GTCAACTGTA 10140

GTGGAAGACA ATGATTTGTG GTAATCATGT AATGCTTAAA AACAATATTG ACTTTTACAG 10200

AACGTTCATA TATGATAAAT ATTGTGTTTA GGAGGAATAC CCAAGTCCGG CTGAAGGGAT 10260

CGGTCTTGAA AACCGACAGG GGCTTAACGG CTCGCGGGGG TTCGAATCCC TCTTCCTCCG 10320

CCATCAATAT TTATATTAAA TTCTATATAT AATGAAGGTA AGTGCTCAAA TTTTGAGTAT 10380

TTACCTTTTT TATTTGTCTT TGAATGGCTC GTAATTTTTG ATAATAGAAA TGATAAGGCA 10440

TTGAGATTGG AAGGGCATTT GGCTTGTGCA ATATACATAG CTAAATGTCT TTTTTGTTTT 10500

GTGAAATATG ATGGATGGCT TGTGTGGACA AGTTTGCTAT TTATAGATAT GCATTTTTCA 10560

ATTTAGGAGT TGGCCATGCA TCTACACTTT ATAATGGTGA GAGCGTGGTG AGGTATTGTT 10620

AATAACGCAA TTGTAGCGAG GAGTTATTGC TACATATGTC GTTATGGCTC ATTGATTTTC 10680

TGAAATGGCT ACCCCAGATA ATTGTGACAA AATAAAAATA TTTTGTTGAA AGCCTTTACA 10740
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TAAAAAGAGA AGATGTAAAA GCCATCGTAA CCGCTATTGG GGGAAAAGAA AATCTTGAAG 10860

CTGCAACGCA TTGTGTAACA CGATTACGTT TAGTGCTGAA GGATGAAAGT AAAGTTGATA 10920

AAGACGCATT AAGTAATAAC GCGTTGGTCA AGGGGCAGTT TAAAGCAGAC CATCAATATC 10980

AAATTGTCAT TGGTCCAGGA ACAGTCGATG AAGTGTATAA GCAGTTTATT GATGAAACAG 11040

GTGCTCAAGA AGCTTCGAAA GATGAAGCGA AACAAGCAGC TGCACAAAAA GGGAATCCAG 11100

TACAACGTTT GATCAAATTG TtGGGGGATA TTTTTATACC AATATTACCT GCGATTGTGA 11160

CAGCTGGTTT GTTAATGGGA ATCAATAATT TACTTACAAT GAAAGGTTTA TTTGGTCCAA 11220

AAGCACTTAT TGAGATGTAT CCACAAATTG CTGATATTTC AAACATCATT AATGTGATTG 11280

CGAGTACGGC ATTTATTTTC TTACCAGCAT TAATTGGTTG GAGTAGTATG CGTGTATTTG 11340

GTGGTAGTCC GATTCTAGGC ATAGTCTTAG GTTTGATTTT AATGCATCCG CAATTAGTAT 11400

CTCAGTATGA TTTGGCAAAA GGGAATATTC CGACGTGGAA CTTATTTGGC TTAGAGATTA 11460

AGCAGTTGAA TTACCAAGGT CAAGTGTTGC CAGTtTTAAT TGCAGCTTAC GTTCTAGCTA 11520

AAATTGAAAA AGGATTAAAT AAAGTCGTTC ACGATTCGAT AAAAATGTTG GTCGTTGGAC 11580

CCGTAGCGCT TTTAGTTACT GGATTTTTAG CATTTATTAT CATTGGACCA GTTGCGTTAT 11640

TGaTTGGTAC AGGTATTACA TCTGGTGTTA CATTTATATT CCAACATGCA GGATGGCTTG 11700

GCGGAGCAAT ATATGGATTG TTATATGCAC CACTTGTAAT TACAGGACTA CACCATATGT 11760

TTTTAGCAGT AGATTTCCAA TTGATGGGTA GCAGCTTAGG CGGTACGTAT TTATGGCCAA 11820

TTGTTGCGAT TTCCAATATT TGTCAGGGCT CTGCAGCATT TGGAGCATGG TTTGTCTATA 11880

AACGTCGTAA AATGGTTAAA GAAGAAGGCT TGGCATTAAC ATCTTGTATT TCTGGTATGT 11940

TAGGTGTTAC TGAACCAGCC ATGTTCGGTG TGAACTTACC TCTGAAATAT CCATTTATCG 12000

CTGGGATATC AACGTCTTGT GTATTGGGGG CAATCGTTGG TATGAATAAC GTACTTGGAA 12060

AAGTTGGTGT TGGTGGCGTG CCAGCATTCA TTTCAATTCA AAAAGAATTT TGGCCAGTAT 12120

ATCTTATTGT GACAGCTATT GCTATTGTTG TACCATGTAT ACTAACAATT GTGATGTCTC 12180

ATTTTAGTAA ACAAAAAGCG AAAGAAATTG TTGAAGATTA ATAAAATAAA AAAGGGGCGT 12240

TCGTTATTTG GACGTCCTTT ATTACGTTAT AAGGTGGTAA TTGTGTGTCG AAAGAAATAG 12300

ATTGGAGAAA ATCCGTTGTA TATCAAATTT ATCCTAAGTC GTTTAATGAT ACGACGGGGA 12360

ATGGTATAGG AGATATCAAT GGAATTATAG AAAAATTGGA TTATATCAAG TTATTGGGTG 12420

TTGATTATAT TTGGTTAACA CCAGTGTATG AATCACCGAT GAATGATAAT GGCTATGATA 12480

TCAGCAATTA TTTAGAAATC aATGAAGACT TTGGAACGAT GGATGATTTT GaAAAGTTAA 12540
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



CGACGCAGCA TGaATGGTTT AAAGAAGCCC GTAAATCTAA AGATAACCCy TATAGAGATT 12660

ATTACTTTTT CAGATCATCT GAAGACGGGC CGCCAACAAA TTGGCATTCT AAATTCGGTG 12720

GTAATGCATG GAAGTATGAT TCTGAGACAG ATGAATATTA TTTACATTTA TTTGATGTCA 12780

GTCAAGCTGA TTTAAATTGG GATAATCCGG AAGTACGTCA ATCGTTATAT CGCATAGTCA 12840

ATCATTGGAT AGACTTCGGC GTTGATGGTT TTCGATTTGA TGTCATTAAC TTAATTTCTA 12900

AAGGTGAATT TAAGGACTCT GACAAAATAG GTAAAGAATT TTATACGGAT GGTCCTAGAG 12960

TGCATGAGTT TCTGCATGAA TTAAATCGTC AAACGTTTGG TAACACTGAC ATGATGACTA 13020

TAGGAGAAAT GTCTTCGACG ACGATTGAAA ATTGTATTAA GTATACACAA CCAGAACGCC 13080

AAGAATTGAA TAGTGTTTTT AATTTTCATC ATCTAAAGGT TGATTATGTT GATGGTGAAA 13140

AGTGGACAAA TGCGAgcTTG nATTTTCATA AGTTAAAGGA AATTCTGATG CAATGGCAAC 13200

GAGGTATTTA TGACGGTGGC GGATGGAACG CGATTTTCTG GTGTAATCAT GATCAGCCAC 13260

GGGTAGTGTC TAGATTTGGT GATGATACGT CGGAAGAGAT GAGGATACAA AGTGCTAAAA 13320

TGTTAGCTAT CGCACTGCAT ATGTTGCAAG GGACGCCATA TATTTACCAA GGTGAAGAAA 13380

TTGGTATGAC GGACCCACAT TTTACATCAA TAGCACAATA TCGTGATGTT GAATCGATTA 13440

ATGCCTACCA TCAGTTGTTA AGTGAAGGGC ATGCTGAAGC GGATGTGTTA GCGATTTTAG 13500

GACAGAAGTC ACGAGACAAT TCGAGAACGC CTATGCAATG GAGTGATGAT GTTAATGCTG 13560

GATTTACAGC TGGTAAnCCT TGGATTGATA TTTCGGAAAA TTATCATCAG GTCAACGTTA 13620

GACAAGCACT TCAGAATAAA GAGTCTATTT TCTATACGTA TCAAAAATTA ATACAATTAA 13680

GACATACGCA TGATATTATT ACGTATGGAG ACATTGTGCC ACGTTTTATG GATCATGATC 13740

ATTTATTTGT TTATGAACGT CATTATAAGA ATCAACAATG GCTAGTAATT GCGAATTTCT 13800

CAGCATCGGC TGTTGATTTG CCAGAAGGAT TGGCTAGAGA AGGTTGTGTT GTGATTCAAA 13860

CAGGCACAGT GGAAAATAAT ACGATAAGCG GGTTTGGTGC AATTGTAATC GAAACAAACG 13920

CGTAAAATAA ATTGAGTGGA TGCGTTTATA TGGCGAAACA AAAAAAGTTT ATGAAGATTT 13980

ATGAGGCGTT GAAAGAAGAT ATATTAAACG GGCAGATTCA ATATGGTGAA CAAATTCCGT 14040

CTGAACATGA TTTGGTGCAA TTGTACCAGT CATCTCGAGA 14100

ATTTGTTGGC ATTAGACGGC ATGATTCAAA AGATTCATGG TAAAGGGTCA CTTGTCATTT 14160

ATCAGGAGGT TACAGAGTTT CCATTTTCTG AACTTGTTAG TTTTAAAGAA ATGCAAGAAG 14220

AAATGGGCGT CGCATATTTA ACTGAAGTTG TTGTGAATGA GGTTGTTGAA GCGCATGAAG 14280

TTCCAGAAGT TCAACATGCT TTAAACATCA ATTCTAGTGA ATCACTCATT CATATTGTTA 14340






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TTGTTTCAGA TATAGGTAAT GATGTTGCGA GTGATTCTAT TTATGATTAT TTGGAAAAGG 14460

TATTAAATCT TAATATTAGT TATTCAAGTA AGTCTATTAC TTTTGAACCG TTTGATGAAC 14520

AAGCATATCA ATTGTTTGGT GATGTATCGG TGGCTTATTC AGCAACAGTT CGAAGTATTG 14580

TGTATTTAGA AAATACAATG CCGTTTCAAT ATAATATTTC AAAACATCTT GCAAATGAAT 14640

TTAAATTTAA TGACTTCTCA AGACGTCGTA TAAAGTAAAC AATGATATAA ATGATTTATA 14700

CTTGCAATTA ACTATTAAAA TATAGTAATA TATATCTTGC CGTGCTAGGT GGGGAGGTAG 14760

CGGTTCCCTG TACTCGAAAT CCGCTTTATG CGAGGCTTAA TTCCTTTGTT GAGGCCGTAT 14820

TTTTGCGAAG TCTGCCCAAA GCACGTAGTG TTTGAAGATT TCGGTCCTAT GCAATATGAA 14880

CCCATGAACC ATGTCAGGTC CTGACGGAAG CAGCATTAAG TGGATCATCA TATGTGCCGT 14940

AGGgTAGCCG AGATTTAGCT AACGACTTTG GTTACGTTCG TGAATTACGT TCGATGCTTA 15000

GGTGCACGGT TTTTTATTTT TTAAATATTA AACCGATTAT TAAGAGTTGA AAATATATAA 15060

TTATAGAAGC TACTTTCTTG AAGACAATTC AGCGTATTAT ACGTGGAACA TGTTTGTGGG 15120

AAGTAGCTTT TTTATATGTG AAGTTTGATT CAAGTGAACT CGATGTGCAG TTTGAATGAT 15180

TTTTGTGTCA ATGAAAAGTA AGAAGTTATA ATTTGATGAT AAAGAAATGA TGGTGAAATG 15240

AGGGGGAGTA TCTTACAATA GAATTATTAA TGAGATACGT TATGATTATT GACAATCAAA 15300

TGCCTACGGA GGACATATGC AAATATATTT AAGTACTTTA ACAGAGTTAG ATTATGATAA 15360

ATCTTTAAAT AGTATTGAAG AAAGTTTTGA TGATAATCCT GAAACGAGTT GGCAAGCACG 15420

TGCGAAAGTA AAACATTTAA GAAAATCTCC TTGCTATAAT TTTGAATTAG AAGTAATAGC 15480

GAAAAATGAA AATAACGATG TCGTTGGACA CGTTTTATTA ATTGAAGTAG AAATTAATAG 15540

TGATGATAAG ACGTATTATG GTTTGGCGAT TGCCTCTTTA TCAGTTCATC CTGAATTACG 15600

TGGACAAAAA TTAGGTCGTG GCTTGGTTCA AGCAGTAGAA GAGCGTGCCA AAGCACAAGA 15660

GTATAGTACG GTTGTTGTAG ACCATTGTTT TGACTACTTT GAAAAGTTGG GTTATCAAAA 15720

TGCTGCTGAG CATGACATTA AATTAGAATC TGGTGATGCA CCGTTACTTG TAAAATATTT 15780

ATGGGATAAT TTGACGGATG CACCACACGG AATCGTAAAA TTTCCAGAAC ATTTTTATTA 15840

ATTGTTCAAT TAAGAAGTAA AGGTATTATC ATGCTATAAT GAGAGGTAAT TGTTTATGGA 15900

GGTGCTAACT TGAATTATCA AGCCTTATAT CGTATGTACA GACCCCAAAG TTTCGAGGAT 15960

GTCGTCGGAC AAGAACATGT CACGAAGACA TTGCGCAATG CGATTTCGAA AGAAAAACAG 16020

TCGCATGCTT ATATTTTTAG TGGTCCGAGA GGTACGGGGA AAACGAGTAT TGCCAAAGTG 16080

TTTGcTAAAG CAATCAACTG TCTAAATAGC ACTGATGGAG AACCTTGTAA TGAATGTCAT 16140

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AATAATGGCG TTGATGAAAT AAGAAATATT AGAGACAAAG TTAAATATGC ACCAAGTGAA 16260

TCGAAATATA AAGTTTATAT TATAGATGAG GTGCACATGC TAACAACAGG TGCTTTTAAT 16320

GCCCTTTTAA AGACGTTAGA AGAACCTCCA GCACACGCTA TTTTTATATT GGCAACGACA 16380

GAACCACATA AAATCCCTCC AACAATCATT TCTAGGGCAC AACGTTTTGA TTTTAAAGCA 16440

ATTAGCCTAG ATCAAATTGT TGAACGTTTA AAATTTGTAG qj^p^q^q^ ACAAATTGAA 16500

TGTGAAGATG AAGCCTTGGC ATTTAtcgCT AAAGCGTCTG AAGGGGGTAT GCGTGATGCA 16560

TTAAGTATTA TGGATCAGGC TATTGCATTT GGTGATGGTA CGTTAACATT GCAAGATGCG 16620

TTGAATGTCA CAGGTAGCGT ACATGATGAA GCGTTGGATC ACTTGTTTGA TGATATTGTA 16680

CAAGGTGACG TACAAGCATC TTTTAAAAAA TACCATCAGT TTATAACAGA AGGTAAAGAA 16740

GTGAATCGCC TAATAAATGa TATGATTTAT TTTGTCaGAG ATACGATTAT GAATAAAACA 16800

TCTGAGAAAG ATACTGAGTA TCGAGCACTG ATGAACTTAG AATTAGATAT GTTATATCAA 16860

ATGATTGATC TTATTAATGA TACATTAGTG TCGATTCGTT TTAGTGTGAA TCAAAACGTT 16920

CATTTTGAAG TGTTGTTAGT AAAATTAGCT GAGCAGATTA AGGGTCAACC ACAAGTGATT 16980

GCGAATGTAG CTGAACCAGC ACAAATTGCT TCATCGCCAA ACACAGATGT ATTGTTGCAA 17040

CGTATGGAAC AGTTAGAGCA AGAACTAAAA ACACTAAAAG CACAAGGAGT GAGTGTCGCT 17100

CCTGTTCAAA AATCTTCGAA AAAGCCTGCG AGAGGCATAC AAAAATCTAA AAATGCATTT 17160

TCAATGCAAC AAATTGCAAA AGTGCTAGAT AAAGCGAATA AGGCAGATAT CAAATTGTTG 17220

AAAGATCATT GGCAAGAAGT GATTGATCAT GCCAAAAATA ATGATAAAAA ATCACTCGTT 17280

AGTTTATTGC AAAATTCGGA ACCTGTGGCG GCAAGTGAAG ATCACGTACT TGTGAAATTT 17340

GAGGAAGAGA TCCATTGTGA AATCGTCAAT AAAGACGACG AGAAACGTAG TAGTATAGAA 17400

AGTGTTGTAT GTAATATCGT TAATAAAAAC GTTAAAGTTG TTGGTGTACC ATCAGATCAA 17460

TGGCAAAGAG TTCGAACGGA ATATTTACAA AATCGTAAAA ACGAAGGCGA TGATATGCCA 17520

AAGCAACAAG CACAACAAAC AGATATTGCT CAAAAAGCAA AAGATCTTTT CGGTGAAGAA 17580

ACTGTACATG TGATAGATGA AGAGTGATAC ATGACAAGCG ATATAATCGT ATGTATAATG 17640

AAAGAAACAT CATTTTATTG ATAAATATTT ATTGATTTTC AAGGAGGAAA TGGAATATGC 17700

GCGGTGGCGG AAACATGCAA CAAATGATGA AACAAATGCA AAAAATGCAA AAGAAAATGG 17760

CTCAAGAACA AGAAAAACTT AAAGAAGAGC GTATTGTAGG AACAGCTGGC GGTGGCATGG 17820

TTGCAGTTAC TGTAACTGGT CATAAAGAAG TTGTCGACGT TGAAATCAAA GAAGAAGCTG 17880

TAGACCCAGA CGATATTGAA ATGCTACAAG ACTTAGTGTT AGCAGCTACT AATGAAGCGA 17940

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TCCCTGGaAT GTGATCATAG ATGCATTATC CAGAACCTAT ATCAAAACTT ATTGATAGCT 18060

TTATGAAATT GCCAGGCATT GGTCCAAAGA CAGCCCAACG TCTGGCTTTT CATACCTTAG 18120

ATATGAAAGA AGACGATGTT GTTCAGTTTG CCAAAGCATT AGTAGATGTT AAGAGAGAAT 18180

TAACATATTG TAGCGTATGT GGTCACATTA CTGAAAATGA TCCATGTTAT ATTTGTGAAG 18240

ATAAGCAAAG AGATCGTTCA GTTATTTGTG TTGTGGAAGA TGACAAAGAT GTCATAGCTA 18300

TGGAAAAAAT GAGAGAATAC AAAGGTTTAT ATCACGTTTT ACATGGGTCT ATTTCGCCTA 18360

TGGATGGCAT TGGACCAGAA GATATTAATA TTCCTTCATT GATTGAACGC TTGAAAAACG 18420

ATGAAGTTAG CGAATTAATC TTAGCTATGA ACCCGAACTT AGAGGGGGAA TCTACAGCCA 18480

TGTATATTTC TAGATTAGTT AAGCCTATAG GTATCAAAGT GACGAGATTA GCACAAGGGT 18540

TATCGGTAGG TGGCGATTTA GAGTATGCTG ACGAAGTAAC ATTATCTAAA GCAATCGCAG 18600

GTAGAACAGA AATGTAATkT CTTCTATTAA ACATTTTTGA TTTTAATACT ATAGTAAGAA 18660

AAGTCACAGT GTAATCATTG TGGCTTTTTT TATGGTGTGG TGTGATGTAC TACTTTATTT 18720

GCGGTGTGGC GGTGGTATGG TTTACCTAGT TTTACTGAGG GATGGGTAAT CTTTAGGAAG 18780

CAAGCCGTTG GTTGTGATTT GTTACTTCTA ATAGTAATGA TGTGAATTGG ATTATCGAAT 18840

TAGATCTATG GTTATGGTGT GTTGGTGCTA TTAATTTGAT AAATGCGGTT AATGACTATG 18900

CAAATGAAAT TCTTTTGTAA TTGAAATGAT AGATGCTGGC TTAGTAAGTT GTACTTCTTT 18960

GGTCTAAAGC TTATTAAATC AGCCTGTATA GCGGTGTTTT GAGAGATTAT TTAAAACTTG 19020

TAAATTTATT TTTAATTTCT GGTAAAAAAA TAACGTTCTG TTTTGCGTTT TTTTTGATTG 19080

ATATGGTTAG AGAAAAATCT GTTTCTTGTT CTAAAAAACG TACTATTTAT AAGTGGGGAT 19140

TTTTTAAGTT CGATTTTTAG GATAAGGGCG TTCAGTACAG ATGACAAAGG TGTAATTTTT 19200

ACTGTTGTTA AGCAGTTTGA AAGCCTGTAT AGTATTTATT TGTTGAGGCA AACAAAACAA 19260

CTCAACTTAA GAAATAACTT GAATTACTAA CGAAAATTAA TTTTAAAAAG TTATTGACTT 19320

AAATGTTAAT AAAATGTATA ATTAATTCTT GTCGGTAAGA AAAATGAACA TTGAAAACTG 19380

AATGACAATA TGTCAACGTT AATTCCAAAA AACGTAACTA TAAGTTACAA ACATTATTTA 19440

GTATTTATGA GCTAATCAAA CATCATAATT TTTATGGAGA GTTTGATCCT GGCTCAGGAT 19500

GAACGCTGGC GGCGTGCCTA ATACATGCAA GTCGAGCGAA CGGACGAGAA GCTTGCTTCT 19560

CTGATGTTAG CGGCGGACGG GTGAGTAACA CGTGGATAAC CTACCTATAA GACTGGGATA 19620

ACTTCGGGAA ACCGkAGCTA ATACCGGATA ATATTTTGAA CCGCATGGTT CAAAAGTGAA 19680

AGACGGTCTT GCTGTCACTT ATAGATGGAT CCGCGCTGCA TTAGCTAGTT GGTAAGGTAA 19740

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GAGACACGGT CCAGACTCCT ACGGGAGGCA GCAGTAGGGA ATCTTCCGCA ATGGGCGAAA 19860

gCtGaCGGAG CAACGCCGCG TGAGTGATGA AGGTCTTCGG ATCGTAAAAC TCTGTTATTA 19920

GGGAAGAACA TATGTGTAAG TAACTGTGCA CATCTTGACG GTACCTAATC AGAAAGCCAC 19980

GGCTAACTAC GTGCCAGCAG CCGCGGTAAT ACGTAGGTGG CAAGCGTTAT CCGGAATTAT 20040

TGGCCGTAAA GCGCGCGTAG GCGGTTTTTT AAGTCTGATG TGAAAGCCCA CGGCTCAACC 20100

GTGGAGGGTC ATTGGAAACT GGAAAACTTG AGTGCAGAAG AGGAAAGTGG AATTCCATGT 20160

GTAGCGGTGA AATGCGCAGA GATATGGAGG AACACCAGTG GCGAAGGCGA CTTTCTGGTC 20220

TGTAACTGAC GCTGATGTGC GAAAgCGTGG GGATCAAACA GGATTAGATA CCCTGGTAGT 20280

CCACGCCGTA AACGATGAGT GCTAAGTGTT AGGGGGTTTC CGCCCCTTAG TGCTGCAGCT 20340

AACGCATTAA GCACTCCGCC TGGGGAGTAC GACCGCAAGt TGAAACTCAA AGGAATTGAC 20400

GGGGACCCGC ACAAGCGGTG GAGCATGTGG TTTAATTCGA AGCAACGCGA AGAACCTTAC 20460

CAAATCTTGA CATCCTTTGA CAACTCTAGA GATAGAGCCT TCCCCTTCGG GGGACAAAGT 20520

GACAGGTGGT GCATGGTTGT CGTCAGCTCG TGTCGTGAGA TGTTGGGTTA AGTCCCGCAA 20580

CGAGCGCAAC CCTTAAGCTT AGTTGCCATC ATTAAGTTGG GCACTCTAAG TTGACTGCCG 20640

GTGACAAACC GGAGGAAGGT GGGGATGACG TCAAATCATC ATGCCCCTTA TGATTTGGGC 20700

TACACACGTG CTACAATGGA CAATACAAAG GGCAGCGAAA CCGCGAGGTC AAGCAAATCC 20760

CATAAAGTTG TTCTCAGTTC GGATTGTAGT CTGCAACTCG ACTACATGAA GCTGGAATCG 20820

CTAGTAATCG TAGATCAGCA TGCTACGGTG AATACGTTCC CGGGTCTTGT ACACACCGCC 20880

CGTCACACCA CGAGAGTTTG TAACACCCGA AGCCGGTGGA GTAACCTTTT AGGAGCTAGC 20940

CGTCGAAGGT GGGACAAATG ATTGGGGTGA AGTCGTAACA AGGTAGCCGT ATCGGAAGGT 21000

GCGGCTGGAT CACCTCCTTT CTAAGGATAT ATTCGGAACA TCTTCTTCAG AAGATGCGGA 21060

ATAACGTGAC ATATTGTATT CAGTTTTGAA TGTTTATTTA ACATTCAAAT ATTTTTTGGT 21120

TAAAGTGATA TTGCTTATGA AAATAAAGCA GTATGCGAGC GCTTGACTAA AAAGAAATTG 21180

TACATTGAAA ACTAGATAAG TAAGTAAAAT ATAGATTTTA CCAAGCAAAA CCGAGTGAAT 21240

AAAGAGTTTT AAATAAGCTT GAATTCATAA GAAATAATCG CTAGTGTTCG AAAGAACACT 21300

CACAAGATTA ATAACGCGTT TAAATCTTTT TATAAAAGAA CGTAACTTCA TGTTAACGTT 21360

TGACTTATAA AAATGGTGGA AACATAGATT AAGTTATTAA GGGCGCACGG TGGATGCCTT 21420

GGCACTAGAA GCCGATGAAG GACGTTACTA ACGACGATAT GCTTTGGGGA GCTGTAAGTA 21480

AGCTTTGATC CAGAGATTTC CGAATGGGGA AACCCAGCAT GAGTTATGTC ATGTTATCGA 21540

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GAGGAAGAGA AAGAAAATTC GATTCCCTTA GTAGCGGCGA GCGAAACGGG AAGAGCCCAA 21660

ACCAACAAGC TTGCTTGTTG GGGTTGTAGG ACACTCTATA CGGAGTTACA AAGGACGACA 21720

TTAGACGAAT CATCTGGAAA GATGAATCAA AGAAGGTAAT AATCCTGTAG TCGAAAATGT 21780

TGTCTCTCTT GAGTGGATCC TGAGTACGAC GGAGCACGTG AAATTCCGTC GGAATCTGGG 21840

AGGACCATCT CCTAAGGCTA AATACTCTCT AGTGACCGAT AGTGAACCAG TACCGTGAGG 21900

GAAAGGTGAA AAGCACCCCG GAAGGGGAGT GAAATAGAAC CTGAAACCGT GTGCTTACAA 21960

GTAGTCAGAG CCCGTTAATG GGTGATGGCG TGCCTTTTGT AGAATGAACC GGCGAGTTAC 22020

GATTTGATGC AAGGTTAAGC AGTAAATGTG GAGCCGTAGC GAAAGCGAGT CTGAATAGGG 22080

CGTTTAGTAT TTGGTCGTAG ACCCGAAACC AGGTGATCTA CCCTTGGTCA GGTTGAAGTT 22140

CAGGTAACAC TGAATGGAGG ACCGAACCGA CTTACGTTGA AAAGTGAGCG GATGAACTGA 22200

GGGTAGCGGA GAAATTCCAA TCGAACCTGG AGATAGCTGG TTCTCTCCGA AATAGCTTTA 22260

GGGCTAGCCT CAAGTGATGA TTATTGGAGG TAGAGCACTG TTTGGACGAG GGGCCCCTCT 22320

CGGGTTACCG AATTCAGACA AACTCCGAAT GCCAATTAAT TTAACTTGGG AGTCAGAACA 22380

TGGGTGATAA GGTCCGTGTT CGAAAGGGAA ACAGCCCAGA CCACCAGCTA AGGTCCCAAA 22440

ATATATGTTA AGTGGAAAAG GATGTGGCGT TGCCCAGACA ACTAGGATGT TGGCTTAGAA 22500

GCAGCCATCA TTTAAAGAGT GCGTAATAGC TCACTAGTCG AGTGACACTG CGCCGAAAAT 22560

GTACCGGGGC TAAACATATT ACCGAAGCTG TGGATTGTCC TTTGGaCAAT GGtAGGAGAG 22620

CGTTCTAAGG GCGTTGAAGC ATGATCGTAA GGACATGTGG AGCGCTTAGA AGTGAGAATG 22680

CCGGTGTGAG TAGCGAAAGA CGGGTGAGAA TCCCGTCCAC CGATTGACTA AGGTTTCCAG 22740

AGGAAGGCTC GTCCGCTCTG GGTTAGTCGG GTCCTAAGCT GAGGCCGACA GcGTAGGCGA 22800

TGGATAACAG GTTGATATTC CTGTACCACC TATAATCGTT TTAATCGATG GGGGGACGCA 22860

tAGGATAGGC GAAgcGTGcG ATTGGATTGC ACGTCTAAGC AGTAAGGCTG AGTATTAGGC 22920

AAATCCGGTA CTCGTTAAGG CTGAGCTGTG ATGGGGAGAA GACATTGTGT CTTCGAGTCG 22980

TTGATTTCAC ACTGCCGAGA AAAGCCTCTA GATAGAAAAT AGGTGCCCGT ACCGCAAACC 23040

GACACAGGTA GTCAAGATGA GAATTCTAAG GTGAGCGAGC GAACTCTCGT TAAGGAACTC 23100

GGCAAAATGA CCCCGTAACT TCGGGAGAAG GGGTGCTCTT TAGGGTTAAC GCCCAGAAGA 23160

GCCGCAGTGA ATAGGCCCAA GCGACTGTTT ATCAAAAACA CAGGTCTCTG CTAAACCGTA 23220

AGGTGATGTA TagGGcTGAC GCCTGCCCGG TGCTGGAAGG TTAAGAGGAG TGGTTAGcTT 23280

CTGCGAAgCT ACGAATCGAA GCCCCAGTAA ACGGCGGCCG TAACTATAAC GGTCCTAAGG 23340
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TGTCTCAACG AGAGACTCGG TGAAATCATA GTACCTGTGA AGATGCAGGT TACCCGCGAC 23460

AGGACGGAAA GACCCCGTGG AGCTTTACTG TAGCCTGATA TTGAAATTCG GCACAGCTTG 23520

TACAGGATAG GTAGGAGCCT TTGAAACGTG AGCGCTAGCT TACGTGGAGG CGCTGGTGGG 23580

ATACTACCCT AGCTGTGTTG GCTTTCTAAC CCGCACCACT TATCGTGGTG GGAGACAGTG 23640

TCAGGCGGGC AGTTTGACTG GGGCGGTCGC CTCCTAAAAG GTAACGGAGG CGCTCAAAGG 23700

TTCCCTCAGA ATGGTTGGAA ATCATTCATA GAGTGTAAAG GCATAAGGGA GCTTGACTGC 23760

GAGACCTACA AGTCGAGCAG GGTCGAAAGA CGGACTTAGT GATCCGGTGG TTCCGCATGG 23820

AAGGGCCATC GCTCAACGGA TAAAAGCTAC CCCGGGGATA ACAGGCTTAT CTCCCCCAAG 23880

AGTTCACATC GACGGGGAGG TTTGGCACCT CGATGTCGGC TCATCGCATC CTGGGGCTGT 23940

AGTCGGTCCC AAGGGTTGGg CTGTTCGCCC ATTAAAGCGG TACGCGAGCT GGGTTCAGAA 24000

CGTCGTGAGA CAGTTCGGTC CCTATCCGTC GTGGGCGTAG GAAATTTGAG AGGAGCTGTC 24060

CTTAGTACGA GAGGACCGGG ATGGACATAC CTCTGGTGTA CCAGTTGTCG TGCCAACGGC 24120

ATAGCTGGGT AGCTATGTGT GGACGGGATA AGTGCTGAAA GCATCTAAGC ATGAAGCCCC 24180

CCTCAAGATG AGATTTCCCA ACTTCGGTTA TAAGATCCCT CAAAGATGAT GAGGTTAATA 24240

GGTTCGAGGT GGAAGCATGG TGACATGTGG AGCTGACGAA TACTAATCGA TCGAAGACTT 24300

AATCAAAATA AATGTTTTGC GAAGCAAAAT CACTTTTACT TACTATCTAG TTTTGAATGT 24360

ATAAATTACA TTCATATGTC TGGTGACTAT AGCAAGGAGG TCACACCTGT TCCCATGCCG 24420

AACACAGAAG TTAAGCTCCT TAGCGTCGAT GGTAGTcGAA CTTACGTTCC GCTAGAGTAG 24480

AACGTTGCCA GGCAAAAAAT GGATGCGATG AGCCGCATTG AGACCGCAAG GTCTCTTTTT 24540

TTTATGTCTA AAACGTCAAA ATAAAAAGCA AACACAAAGA AAAATGGCTT GGCGAAGTGA 24600

AAACDTTTGA ATCTGACGAA ACGAGAAAAG ArCGCAACGA GTTTAGTAGA GCTAAATGAG 24660

TAAGyGAGAG CCGAAGrAGA GGAAAGAAGC AAGCGATTGT CACAAGTCAA GAAAGGTTCT 24720

TAGCGASGAT GGTAGCCAAC TTACGTTCCG CTAGAGTAGA ACTGGAAATG ATAATTTAAT 24780

AATGTACACT TTCGATTGTC TAAGTATGTA CAACTTTAAT TTTGTGTTTA TATAAATTTA 24840

AAATGATATC ATCGAAAACA AAATATTGTA TAAATAGAGA AGAGCAGTAA GACGGTATCT 24900

AATTGAAAAT GATCTTACTG CTCTTTTATA TACTTTATTG AAATACAAAA AGGAAATTAA 24960

TTATTATACA ATAGACAAGC TATTGCATAA GTAACACTAA CTTTTATCAA AGAAGTGTTA 25020

CTTTATAATT AATGATTTTA TTAGAGCGTC TACATGCGGT TTTAAAGCAT CATCGTCTAT 25080

ACCGCCAAAG CCTAATATAA ATTTAGGGGT TTTCTTATAG TCTTGATCAT CATCAAAATT 25140

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TCCATTTTTT ACTGTAATTG TAAAATGCAT ACCCGTTTCA GCACCTTGAA TATCAAGCTG 25260

CTCTTT3TAA GGTTTCAATC TTTTTAAAAT ATAGGTTAGT TTTCTACGAT AAATTCGTCT 25320

CATTTTATTT AAATGCCTTT CAAAACCACC GGAAGATATA AACGTTGCAA TAAGGTTTTG 25380

CATATGAACA GGTACAGTGT TGCCTTCAAT GTGATTTTGA GAATGATATT TTTTCATTAT 25440

AGAATAGGGT AACACCATAT ATGCAACTCG ACAGCTAGGA AAAATAGACT TTGAAAATGT 25500

ACTGATATAA ATCACTTTTT CTCCTCTTGA ATATAGACCT TGAATTGCTG GAATGGGTTT 25560

GCCGAAATAT CTAAACTCGG AATCATAATC ATCTTCTATA ATAAATCGTT CTTCTTTTTC 25620

TTGAGCCCAT TGTATTAATT GAGTTCGTTT TTTTAAGTCC ATCACATATC CAGTTGGAAA 25680

TTGATGGGAA GGCGTTATAT ATACTATATT TTTTTGTGAT TTAATAACTT CATCTACGTT 25740

TATTCCATTA TCTTCAACTT CAATTTGTTC ATATTCAACT TGTTTTTTAT CTAAAATATT 25800

TTTGATTGGT GGATAACTAG GTTTTTCGAT AATAAATGTT GAAGTATAAA GTAAATCGAC 25860

TAATTGATTT ACTAATTGTT CGGTAGATGA GCCAATTATA ATTTGATTAG GATCACAAAT 25920

TACGCCACGA TTAGTAAATA AATAAAATGC CAGTTGAAAC CGCAAATGTA ATTCTCCTTG 25980

AAAATGTCCT CTACGTAATT GATTTAAATG ATTTGTATCA TAAAGATCTT TGGAATACTT 26040

TCTGAAAAGT TCTATAGGGA AATGTTTCGT ATCTATTTCA TCCAAATTAA AAGCATAATC 26100

ATAAGCTTCA TCACTCGOTT TTGGTTTATA TGAATCATCA TCAAAAAGAG AGGGGATAGG 26160

TTGATTGTTT AAAATTGTTA AAGATTCAAT TTCGGACACA AAATATCCAG AGCGAGGTCT 26220

TGAATAAATG TAACCTTCGT CTAATAGAAG TTGATATGCA TGCTCTACGG TTGTTTGGCT 26280

AATAGATAAA TGTTTGCTTA ATTGTCTTTT AGAATAAAAT TTATCGCCTT CTTTAAATTG 26340

ACCTTCAATT ATTTGTTTTT TTAATTTTTC ATAAAGTTGA TGGTATAAAG TGTTTTTCAA 26400

TTTXATAACT GACCTCCTAA ATTTATCTTA TTTTGTACCT TTTTAAATAT CAGTTTATAC 26460

ATTACAATGT ATTTAATCAA CTTGAAAAGG GGTTTTATGT ATAATGAGTA AAATTATTGG 26520

ATCAGACAGA GTCAAAAGAG GTATGGCTGA AATGCAAAAA GGCGGCGTTA TTATGGATGT 26580

CGTTAATGCT GAGCAAGCAA GAATTGCAGA AGAAGCTGGC GCGGTAgCAG TTATGGCATT 26640

AGAACGAGTA CCTTCTGATA TTAGAGCTGC TGGTGGTGTT GCACGTATGG CAAACCCTAA 26700

AATTGTAGAA GAAGTAATGA ATGCTGTTTC TATTCCAGTC ATGGCTAAAG CACGTATTGG 26760

TCATATCACT GAAGCAAGAG TATTAGAGGC GATGGGTGTT GACTATATTG ATGAATCAGA 26820

AGTGTTAACA CCAGCAGATG AGGAATATCA CTTAAGAAAA GATCAATTTA CAGTACCATT 26880

TGTATGTGGA TGTCGTAATT TAGGTGAAgm TGCGCGTAGA ATTGGTGAAG GTGCTGCTAT 26940

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ACAAGTTAAT TCAGAAGTTA GTCGATTGAC TGTAATGAAT GATGATGAGA TTATGACTTT 27060

TGCGAAAGAT ATCGGTGCGC CTTATGAAAT TTTAAAACAA ATTAAAGACA ATGGTCGTTT 27120

ACCGGTAGTT AACTTTGCAG CTGGTGGCGT TGCGACTCCT CAAGATGCTG CTTTAATGAT 27180

GGAATTAGGT GCTGACGGTG TATTCGTTGG ATCAGGTATT TTTAAATCAG AAGATCCAGA 27240

AAAATTTGCT AAAGCAATTG TTCAAGCAAC AACACATTAC CAAGACTATG AACTAATTGG 27300

AAGATTAGCA AGTGAACTTG GCACTGCTAT GAAAGGTTTA GATATCAATC AATTATCATT 27360

AGAAGAACGT ATGCAAGAGC GTGGTTGGTA AGATATGAAA ATAGGTGTAT TAGCATTACA 27420

AGGTGCAGTA CGTGAACATA TTAGACATAT TGAATTAAGT GGTCATGAAG GTATTGCAGT 27480

TAAAAAAGTT GAACAATTAG AAGAAATCGA GGGCTTAATA TTACCTGGTG GCGAGTCTAC 27540

AACGTTACGT CGATTAATGA ATTTATATGG ATTTAAAGAG GCTTTACAAA ATTCAACTTT 27600

ACCTATGTTT GGTACATGCG CAGGATTAAT AGTTCTAGCG CAAGATATAG TTGGTGAAGA 27660

AGGATACCTT AACAAGTTGA ATATTACTGT ACAACGAAAC TCATTCGGTA GACAAGTTGA 27720

CAGCTTTGAA ACAGAA7TAG ATATTAAAGG TATCGCTACA GATATTGAAG GTGTCTTTAT 27780

AAGAGCCCCA CATATTGAAA AAGTAGGTCA AGGCGTAGAT ATCCTATGTA AGGTTAATGA 27840

GAAAATTGTA GCTGTTCAGC AAGGTAAATA TTTAGGCGTA TCATTCCATC CTGAATTAAC 27900

AGATGACTAT AGAGTAACTG ATTACTTTAT TAATCATATT GTAAAaAAAG CATAGCTTAA 27960

TGTATGCTAA ATCAACGAAT TATTGATATT TATAGATTTG TTGAGAAGAA AATATCTCCT 28020

TCAAACTTAG CTTTGGAGGA GTTATTTTTT ATGTCAAAAT TAAAAATGAT AAAAAATAAA 28080

GCTATACATA AGAAAAAAAC CCTTCAAAGA GACTGAGAAT AGTCAAAATT TTGAAGGGGT 28140

TAATTCGATG TTGATGTATT TGTTAAATAA AGAATCcAGC GATTGCAGCT GAAATGAAAG 28200

ATACTAGTGT tGCACCGAAT AATAATTTCA AACCAAAGCG GGCAACTGTA TCTCCTTTTT 28260

TGTCATTAAG TGATTTAATC GCACCTGAAA TAATACCGAT AGAGCTAAAG TTAGCAAATG 28320

ATACTAAGAA TACAGATGTA ACACCTTTTG CGTGTTCAGA TAAATCACTA AGTTTACCAA 28380

GTGCTTGCAT TGCTACAAAT TCGTTAGATA ATAGTTTTGT CGCCATAACT GAACCGGCTT 28440

GAACTGCATC TTGCCATGGC ACACCGACTA AGAATGCAAA TGGTGCAAAG ACAAAACCAA 28500

TTAATGTTTG GAAATCCCAA GAAATAGCGC CACCTGAAAC TGTACTAAAG ATATTGCTTA 28560

CAATTCCATT TAATAGAGCG ATAATGGCAA TGTATCCGAT TAACATTGCG CCTACAATGA 28620

CAGCTACTTT AAATCCATCT AAAATATATT CTCCTAGCAT TTCGAAGAAT GATTGTTGTC 28680

TTTCTTCAGT TTCTTCAACT AATAATTTGT CATCTTCTTC ATTAACTTTA TAAGGGTTAA 28740
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is 



TAGGTTCAAT TAAGGTAAAG TATGCACCGA TAATTGAAGC AGAAACAGTC GACATTGCTG 28860

AAGCTGTTAA TGTGTATAAA CGTTGCTTAG GTATGTATGG TAATTGTTTT TTAATTGAAA 28920

TAAATACTTC AGATTGTCCC AAAATTGCTG CAGCAACTGC ATTGTATGAT TCTAAACGTC 28980

CCATACCATT AATTTTAGAA ATTAAGAATC CTAAAACATT AATGATTAAA GGTAAAATCT 29040

TTGTGTATTG AAGGATACCG ATAATCGCTG AAATAAATAC GATAGGTAAT AATACACTGA 29100

AGAAGAATGG TGGTTGCTTA GGATCGATAT ATTGAATACC ACCGAATACA AAGTTAACAC 29160

CATCTGCTGC TTTTAATAAT AAGTAGTTAA AACCGTTTGA AATACCACCA ATAACCTTGA 29220

TTCCCATTGT AGTTTTAAGC AAGATAAATG CAAAGATAAG CTGAATTGCA AGTAAAATTC 29280

CTACATATTT CCAGCGAATA TTTTTCCTGT CTGAGCTAAA TAGAAACGCA AGTGCTAAAA 29340

AGAAGATAAT TCCGATAATC CCAATTAGAA TATGCATATA TTTCTCATTC CTTTAGTTTT 29400

TTCTACaATc TATCATACAA TAAAATGGAA GGGCTAACAT CATAAATTTT TGAAAATATA 29460

AAAACAAATT AATTGAAAAA GGTCAAAATA GOTCATATAA TATAGTCAAA GAAGGTCAAA 29520

AAGGGGTGAT ATACATGCAC AATATGTCTG ACATCATAGA ACAATAaTCA AACGTTTATT 29580

TGAAGAGTCG AATGAAGATG TCGTTGAAAT TCAGAGAGCG AATATCGCAC AGCGTTTTGA 29640

TTGCGTACCA TCACAATTAA ATTATGTAAT CAAAACACGA TTCACTAATG AACATGGTTA 29700

TGAAATCGAA AGTAAACGTG GTGGTGGTGG TTACATCCGA ATCACTAAAA TTGAAAATAA 29760

AGATGCAACA GGTTATATTA ATCATTTGCT TCAGCTGATT GGACCTTCTA TTTCTCAACA 29820

ACAAGCTTAT TATATTATTG ATGGGCTTTT AGATAAAATG TTAATAAATG AACGTGAAGC 29880

TAAAATGATT CAAGCAGTTA TTGATAGAGA AACGCTATCA ATGGATATGG TTTCTAGAGA 29940

TATTATTAGA GCAAATATTT TAAAACGTTT GTTACCAGTT ATAAATTATT ACTAAATGAA 30000

ATGAGGTGTT GAAGTGCTTT GTGAAAATTG TCAACTTAAT GAAGCGGAAT TAAAAGTTAA 30060

AGTTACAAGT AAAAATAAAA CAGAAGAAAA AATGGTGTGT CAAACTTGTG CTGAGGGGCA 30120

CCATCCGTGG AATCAAGCTA ATGAACAACC TGAaTATCAA GAACATCAAG ATAATTTCGA 30180

AGAAGCATTT GTTGTTAAGC AAATTTTACA ACATTTAGCT ACGAAACATG GAATTAATTT 30240

TCAAGA 30246
(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14333 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

TATTCCCCCA TCGGTTTATT AAATCGTCCA TTTCAATACT GTTTTTCCCC AAGATGTCGA 60

TAAATCCATT TCAAACGCTT GGACGATATC TTGCATCGTA CATACATTAA TTTCATGTCC 120

TTTTAATAAT GCTAACTTTT CAACTATGTC TGGGTACTTA CGATATAAAT CAACAACTTG 180

CTCAAAATCT TTAGAGCCGC ITCGACTACT ACCAATCAAC GTTAATCCTT TTTCAAGTAC 240

TAATCGTGTA TTCACTTCCA CGGGTAATTC ACTTACGCCT AACAAAGCAA TACTGCCTTC 300

TGGTGAAATA TGTTCAACTA TTTGTTGAAG TGCAACTTGA CTTCCTTTAC CTCCAACACA 360

TTCAAATGCA TGATCAATTT TAAGATCATC TGGTATTTGA TTTACTGTAA AGATGTCATC 420

TACAAATGAA AAATGACTTA ATTTATAGTC TGTCTTACCA AATACATAAG TTTTAGCTTC 480

TGGGTACAAC TTACGTAGCA AAATAGCAGT AATATAACCT AAGTTACCAT CACCCCAAAT 540

ACCAAAGCTG GTTTTCAAAG GTATAGATTT ACGTTCAAAT CGTTGTATAG CATGATAACT 600

TACTGACACT AACTCTGTGT ATGAAATCGT ACTCAAATCA ATGTCATTAG GCAGCGGAAC 660

GATACGATCA TGTGCCATCA CAACGTAGTC TTGCATAAAA CCATCATAAC CACTAGATCT 720

AAAATAACTA GAGGCTAAGT AATTCTCCGC AATAATATGA TGTTGCTCTG TAGGTGTATT 780

CGGTACCATT ACTACTTTCG TACCTTTTTC AAATACCCCT TTACTATCAA ATACAACTTC 840

ACCAACAGCT TCATGAACTA ATGACATTGG TAATTTTTTG CGTAGTACAT TTTCATCTCT 900

TCGACCTGTG TAATACCTTT GATCAGCTGC ACAAATAGAC AAGTATAAAG GTCTTACGAT 960

GACATGATTA CCATAAATAT CAACATTATT ATATGTGACG TCGAACTGTC TCGGTGCAAC 1020

GAGTTGATAT ACTTGATTAA TCATCGGCAA TATCACCTTG AATAATGGCA TTTGCTACTT 1080

TTAAATCATA CGGTGTTGTC ACTTTAATGT TGTATAGTTC TCCaCGTACC AATTTAACTG 1140

CATQTCCAGA TTCGACAATG ATTTTACATG CATCTGATAA GATTTCTTTT TGTTCACTAC 1200

TTAAGGCGCG ATAACTATCT TGTAATAATT TAATATTAAA TGATTGTGGT GTTTGGCCTT 1260

GATACATTTC ATTCCTTACA GGGATACTGT GTATGTTCTG TTTATCTTTA GACATTACAA 1320

TCGTATCAAT TGCTTCAATG ACTGTATCTA CTGCACCATA TTTTGCTGCT ACTTCAATGT 1380

TCTCTTTAAT AATACGTTGA GTTAAAAATG GTCTTACGGC ATCATGAGTT ACAATCACAT 1440

CATCATTATT AATTCCATTT ACATTGCGAA TATGGTCGAT AATGTTCATA ATTGTTTCAT 1500

TTCGATCCGT ACCACCTGCA ACTACTTTGA CACGTTGATC TGTAATGTTA TATTTTTTTA 1560

AAATATCCTG TGTATGGGAA ATCCACTGTG CTGGCGTTGC GATAATAATC TCATTAAATT 1620

CACTCACTAA AATGAACTTC TCAATTGTAT GGATTAAAAT CGGTTTATTA TCAATATCTA 1680
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C7GCATAAAT CATGTTGTCC TCCATTCTGT CATTACATCA TTTCCATTTA TACATTAC7G 13 00 

ACCTATGCCC GCACATAAGC CTAACCTATT GCTCACTTGC CTCTTTTATT AATCCAAAGA 1860

TAGTTGTCAC AATAGTGTGA TAATTTTTTA TAAAAATGTA TTTTTGTAAC TGACCATTCT 1920

AAGTTGTTTT GCCATGCAGT TAATCATTAA CTCTGACGAT ATTAAATTGT TAAAGGTATT 1980

AATGTTTACT CTTTTTCAAA TTCATTATTA CTGCCATCAT TTTACCATAT ATTATAATAA 2040

ATTTATCTTA TTAAGTGGCT GTACTTGATT TTCACTTTAA AAATTATCAA ATATTGCCAT 2100

CTCATTTTAA GTATACAAAA TGCAAAACAA CCGATTCACA AGCATATTTC ACACAAGTAA 2160

ACCGGCTATT TATCAACGTA TATTCGAAGA TGAATTATTT CGATAGTATC TATAGACCAG 2220

ACGGCATTCG CACTTTCATA GCTATAACTA TACCAGCGTT TTCGTCCTCA AAGGTGCATA 2280

CTAATAAATC GTAAACATGA CTTTATCAAA TCGTTCTTTC TTGTTAACTA ATTTATCAAA 2340

TGTCTCCGGG CCTTTTTCTA ACGGTAAAAA ATGAGAAATA ATAGGCTTTA CATTAATATC 2400

TTTCGTCTTC ATATAATGTA AGGTTGCCGT CCACTCTTTG CCCGGAAAAT TACTGGACAA 2460

ACAGTTCCAA GAGCCACATA CTGTCAACTC GTTACGCAGA ATTTTTTCAA AATGAACGCG 2520

ATCAATCTCA ATATCATCAT ATGGTATTCC GAGTAATACC ACCTCGCCAC CTTTTTTAGG 2580

TAGCGTCAAT ATTTGACCAA TCGTAACTTT AGCACCTGAT GATTCTATAG CTAAATCGAT 2640

TTGATTGGCG TAATGATTTT CGATGAATTT CTCAAGATTT TCTTCTTTTG AATTGATTGT 2700

TTGATGTGCG CCCAATGATG TTGCAATATC TAGTTTATGC GCATCTATAT CTATAGCGAT 2760

GATATGTGCA GCACCAAATA TTCGTGCCCA TTGAATAGCT AACAAACCTA TACTGCCACA 2820

CCCCATTACT GCAACAGTCA TACCAGGTTG TATATTCGAT TTATAAAACC CATGCGCAAC 2880

AACGGCTGAT GGCTCAACCA TTGCTGCTTC AATGTAATCA ACATTGTCTG GAACCTTTAA 2940

AACATTTTGC GCTGGCAATT TGACATATTC CGCGAACGAT CCAGGTTCAT ATGAGCCAAT 3000

GACGAATAAC TTTTCACATC GTGCATATTC ACCTTTTAAA CAATACTCGC ATTGATAACA 3060

AGGTATTGCT GGGCAACCTG TCACTTTGTC GCCCACATTA ACATGCGTAA CATCACTTCC 3120

AATGGCATCT ACTACACCTG AAAATTCATG ACCAAATGGC ATACCTTTAA TGTATGGCCC 3180

CATTTTTTTG TATCGTGACG TGTCTGAACC ACATATGCCA GTCGCTCGTA CTTTAATAAT 3240

AACGTCATTC GCACTTTCAA TGACTGGCTT TTCATTATCC TCATACCGTA AATCTTCCAC 3300

GCCATATAAT TTCAATGCTT TCACTTGTAA ATCACCTCAA ATTTGATTTA ATTCACAACT 3360

TTTTTCTTTT TAAAAATACC TGTCGCAAAA TAACCTGCAA TGACAATGGA ATTACTTACG 3420

AGTAAATGTT CCATATAAAA ATCAGTGATT TGTCTTAATG GCCCAAGCAT AAAAGTTAGC 3480
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35 



TGCTTTAATA CCTTCGCCGG ATTTTAAATG TTGATACGCC TCGTCCCATT TCGAAATATC 3600

ATATATTTTT GTCACCAAAG CTTCAGCATT TACTAAACCA TCCGCCATAA GTTGCAATGA 3660

AGGTTCCCAA TCTGCTGGCT TTTGACTTCT ACTACCAACA ACTGTTATTT CTTTTTGAAT 3720

CACTTTTTCC ATATCAAATG GAATTTCAGC ATCCTTAAAA ATACCTATTT GACTGTAGAA 3780

ACCTTTTTTG CGTAAAATAT CCAAACCTTG TCGTGCTGCT GGAACTGCAC CTGAACATTC 3840

AACAACAACA TCTGCACCGT AACCGTCTGT AATTCCATTG ATATACGTTT TTAAGTCTGT 3900

TTGTTGTAAA TTGACTACAT AATCCATGTG CAATGCTTCT GCTTTATCTA ATCTGACTTT 3960

GTCATTGTCC AATCCAGTTA CCACAACAGT TGCGCCTTTA CTTTTTAACA CTTGTGCTAC 4020

AAGTAATCCG ATTGGCCCAG GTCCCATTAC AACTGCTACA TCGCCTGAAT TGACTTGAAT 4080

CTTAGAAACG CCATGATGTG CACATGCTAA TGGTTCTGTC ATAGCTGCAG ACTGATACGA 4140

TAtTCGTCTG GAATATGATG CAAACTTTCT TCACGTGCAA TGACATAATT AGTAAATGCG 4200

CCATCAACTT GTGTTCCAAT ACCTTTTCGA TGGTTGCATA AATTATAGTC TTTTGATTTA 4260

CAGTATTCAC ACTCATTACA AACATAGAAT GTCGTTTCAG aTGtGACACG GTCACCAACT 4320

TTAAAATCTT TAACGTCTGC TCCAACTTCA ACGATTTCAC CAGAAAATTC ATGACCTAAT 4380

GTCACTGGAA AATTAACTTT ATAATGACCT TCATAAGTAT GAATATCTGT GCCACAAATT 4440

CCTGCATAAT GTACTTTAAT CTTTACTTTA TCATCTAGCG GTGTTGCAAC TTCTTTATCA 4500

AGAAGTTCTA AGTTGCCATG TCCTTCTCTT GTTTTTACTA AAGCTTTCAC CACAAACACC 4560

TCGATTTTTA ATTGAATAGA CTAAATAGTT TAAAGATAAG ATAGTTAACG ATATTACCAC 4620

CTTGATCAAT ACTTGAAATT TCAGATGAAC CTTTTGGCAT TTGTACATTC GTACCTTTCG 4680

CCATATCTGT GAAAATGGGT GCTACGTCTG TTGCAATATA TAGTGAAATT GCAATCATAA 4740

TCGTACCCAC AATGACAGAA TGAATAATGT TTCCTCTTGC TGCACCAACA ATAAACGCGA 4800

CAACAAATGG TATCGTTGCT AAGTCACCAA AAGGTAGTAC TTGGTTTCCT GGTAAAATAA 4860

CGGCTAATAA AACAGTGATA GGTACTAAAA TTAATGCTGT CGAAATAACT GCTGGATGAC 4920

CTAATGCTAC AGCCGCATCC AATCCAATAT AAATTTCACG TTCGCCAAAA CGTTTATTTA 4980

GCCATGTTCT TGCAGACTCT GAAACTGGCA TTAAAPCTTP CATTAAHATT TTTirP^TTP 5040

TAGGCATTAA TACCATTACT GCAGCCATTG ACATTCCTAA ATTAATGATG TCTCCAGGTT 5100

TGTAACCTGC TAACACACCA ATACCTAAAC CTAAAATTAA GCCGACAAAT ATAGACTCTC 5160

CAAATGCGCC AAAACGTTTT TGAATTGTTT CAGGATCAGC ATCTAACTTA TTCAGACCGG 5220

GTACTTTTTG TAACAATTTA ACTAAGTAAA TACCTGGTGC ATAAGAAATT GTACTTCCTG 5280
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CTACTTTCAA ACAGATAATT TGGAAAATAA CTGCTGCTAA TAACGCTTGC CAAATACTGC 54 00 

CTGATACGGC ATAAACCATT GCTGCTGTAA ACGTATAATG CCAAAAATTC CAAATATCTA 5460

CATTCATCGT CTTTGTCACT TTAGTTACTA GCAATACAAC GTTAACTATG ATTCCGAGTG 5520

GAATAATAAA TGCTGCGACA GATGATGCCC AAGCGATAGA TGATGTTGCT GGCCAACCTA 5580

CATCAATCAC ATTCAGACTG ACGCCTAAAT TTTTAACCAT CGCTTGTGCT GCTGGCCCTA 5640

AATTTTTAAC TAATAAATCG ATGACTAAGA AAATCCCTAC AAAAGCCACA CCTATTGTTA 5700

AACCA3ACCT AAATGCCGCT CCAATTTTCT GCCTAAAGAA TAGGCCAAGC AAGAATATGA 5760

CAACCGGTAA AATAACAGTt GCACCTAAAT CTAAAAATCC CCTTACAAAA TCAGTGAAGT 5820

AACTCATATT TAAACCCTCC CTGTTATATA TGCATTGTCA CGATACTTTC CGATTGTGAT 5880

TACATTTGAC GTTACAGTCA TTTCAACGAC AACCCTTGCT AAATTCGACT GCAGTCCTTT 5940

TGAATTACAG tCACTGCGTT TCTATGTCAT CAACAATCAT TTGTCGTGAT AGTCATTTAT 6000

ATGCAATTTG CATATATTAA TATGTTATCG ACCCACGTTA CATATCAATT CCGTTATTTT 6060

TGTAACTCTG TTAAGATTTG TTGTTTTGTT TCTTCAATAC CAATACCAGT TAAGAAATTA 6120

CGTGCGTTGA TAACTGGGAA TTTATATTCT TTTTTTGTCA TTGCAGTTGT AACTAATAAA 6180

TCTGCAGTGT CTTCATAAGG TCCAACTTCT GTAATTTTGA TTTGTTTAAT ATCTACTTTA 6240

ATATTGTGTT CCTTTGCCAT TTCTTCAATT GCATTATTTA CTACTGTTGA CGTTGCAATA 6300

CCTGCACCAC ACGCTACTAA TACTTGTTTC ATTTTCAATT CCTCCAATTA ATTTTTAGTT 6360

ATATTCCAAA TAATCATTGA TTAGTGTTGC TAAAATTGTT TCATCTTTCG TTCGTAGAAT 6420

CTGCTCCAAT TTTTCTTCAC TTTGAAAAAT TTGCATCAAC TGTTGTAACA GCTTAAGTTG 6480

ATCATCTACT TTATCCATTG CTAACATAAA AACGATTTTC ACTTCTGTCT GTTGATCAAG 6540

TGTTCCCATT TCAATAAACG GCACTTCTTT TTCTAGAACA GCCACACCTA TCGTTCTATG 6600

GTTAATATGT TCGACATCTG TATGCGGTAT AGCGACCGAA CATAGATGCG TTGGTAAACC 6660

AGTAGCAAAT TCTTTTTCTC TGTCGATGAC TGCATCTTTA AACGTTGACT TCACGAACCC 6720

ATTTTGAAAT AACACATCTG ACATTTGTGA CAATACGGAT TCTTTATCAG TTGCCGACAA 6780

ATTGAGCATT ATATTTTCTT TATGCACTAA TTGCTGTCCC ATCCATTTTC CCTCGCTTCT 6840

TTATTTGAAT AATTTTTTAA AATCTCATTT ACATCAGAAT TTTTGCGACT TTGTATGATG 6900

CGCTTAATTG CGTCATTGTC TTGCGCCACA TCTCTCAATT GTAGTAACGC TCTTAAGTGT 6960

GTCACTTTAT CAACAGCAGC AATAGGTACA ATAATATGGA TTGCTGTGCC ATCTGACATG 7020

TATATTGGTT CTTGTAATAT CAACATACTC ATCGCTGTTT TATGTACATG CTTTTCAGAG 7080
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25 



TGCATCTCAT GAATATATTT AATATCAATA AAATGATTAG CAACTAACAC ATCACTTGCT 7200

TTAGCAATAG CTTCATCAAT ATTTTCAACA TGATGCATTC TTTTCACGTG CCTTGCCGGT 7260

ATCAAGTCAG CTAAATCTAA TGyCTwATTT tGTGtGACaA TCGATCCATT AA7GGTTGAA 7320

ATTGAATTAT AATTGGCAAT AAAATCTTCT AAACCATCAC GTAGTcTGTA ATGTCATTAA 7380

CTGTCGTTGT GCGTTCAATT AATGCCATTA ACTTGTTTAT TTCCT7ATCA ATGTCAGCCG 7440

ATTCCTTATT AATGTACTTC ATCACTTCTT TACGTAACTT TCGTTGCTCA TTTTCAGATA 7500

AAGCTACTTT TGTGATAAAT AATTTTTTA? GTGTTAGGAC AAACATTGGT GAAAAGACGA 7560

TGTCATAATC TAATGTGTAA TTTTCAAATG TTCTAAGTGA AATCGCATCT AAGAAAATAA 7620

TTTCTGOAAA TAAGTTTCGC AACTCGTATA ACATCATTTG TGATACTGAC GTGCCTTGTG 7680

TACACACGAT AATAGCTTTT ATCTTGCCAT CGAAGTTTTC ATCTTGACGT CTCAAACTAC 7740

CTCCGAACAA CATGGTTAAA TATGCTATTT CATTATCAGG CAACGATTTT CCGAAATATT 7800

CAGTTAACGA TTGACATGAT TGTTTCACCA TATGAAATAA GGATTGATAA TTTCCTTGTA 7860

AAGGATTTAT TAATTCATCA CGATCCGTTA AGTTATATTT AATCCTATAA AAAGCAGGCG 7920

TTAAATGTAA CAAGAGTTGC TGTGATAATT TCTCCTTATC TTCAATGTTA ATAAAAGTGA 7980

TTTGTTCAAA ATGGTGAATC ATTTGAGCGA TGGCCATCGT TAAATTCGAT ATGCTATCTG 8040

ATTCTTGCAA ATCAGTCCAT TGCACACTTG TTGAAAGTAA GTGTAATGTC AAATATAACT 8100

TTTCCGCTTC TGGCAAATCC GGCTCATGTT GCGTCATAAT CTCCGTTGCT TGATATTCTT 8160

TCGTATCCCT CAAATACTGA TAATTAATAT TTAATGGATT CATCACATGA CCACTTTGAA 8220

TTCGTCTACG AATCACACAA AGGACATAAG GCAATGAACT AAGTGATTTG TCTATAAAGC 8280

GACTCTTCAA AAATTGTTCT ACCTGTTTGA TCTTGTCTTT TTGATATGCG ATATCTTCGA 8340

ATGtTAAGTT GAGCGCCTTT AAAACTTCAC TTTTAGTAAT ATCATGATTC AACCTTTGAT 8400

CAATCAACTT AATGAAGAAA CGGCGAACTT CAAATTCATC ACCAACAATT TCATAACCAT 8460

GTTTTCGAGA ATACTTAAGT GACAAACCAT GATTTTCCAA TTGCTCTTTC ACATGATTTA 8520

TATCGTGAAT GACAGTATTT TTACTGACTT GTAAATCAAT TGAAAAATGG TTTAGAGACA 8580

TTGCGTTTTC CTTACTAAAA AGCATGAGCA TTAAATAATA ACGACGTGTT TCTATGCTAA 8640

AAATGACATT GTTGCCGTTT AACATTTGCT GCTCCGATAC ATCTCGCTTG AATAACGTCA 8700

TGATTTCAGA ACTTACAATA AAATTTCCTT GGCTTGTTCT TTCAAGTTTT GGATAACCCT 8760

CTTGTTCAAG CCACAAATTG ATTTTTTGAA TGCGATATCC TAGTTGTCTA CGAGACAAAC 8820

CAAATATCGA TTCAAGTTCT TTACCATGAA TAGTAGGATT CAATACAATT TCTCTGAGTA 8880
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TCAATCGTCA CACCGATGTA CACACTTTGA A CAC AT ATTT TCAAAATGAG CATGTACATC 900 0 

ATTGTGATGT TTTAACAACA TTTCAATTAT ATCTATATTT TTTGTGATTT TAATCTTTTA 9060

AAATAAAGCA ATTGAAATTT TTGCATATAT TTTTGTGTTT TGTGTTTTTT TGAAGCATTT 9120

TTAACATACA TATCTCAATC ATTATCAAAT TGTCATGACC ATTGTAACCC AATACAAAAA 9180

CCCTAAGGAC GCTTATATCA GGCGCCTTAG GGTTAACTGT ATCTATTTAA TTAAGTATTA 9240

TTATTCGTAT GTACGTAACT TATGGTCTAT CAAGTTCCAC ACTTCTTCAA CATCAACTGC 9300

TGTAGCAAAA TAAGCATTGG CAGGCTTACC TGTAACATGA TTTAAATCGA CAGCCATAGT 9360

GCCATAAGTT AGTGGACTTT GATGTTCAAT GTCGATATTA ACGGGTACCA TTGTAAACAA 9420

TTCTGGTTGT AACAAATACA AAATTGTACA AGCATCATGT ATTGGACCAC CATCCATATT 9480

AAAGTGAGTC TTGTATGTCT TCTTAAAGAA TTGCAATAAT TCTACGACGA ACTGTGCAAC 9540

AGGATTATTG ATACTTTCAA AGCGTTCAAT CACGTGATCG TCGGCTAAAA CTTGATGTGT 9600

TACATCTAAA CCAAACACAT TTATAGTAAT CCCACTTTCA AAAACACGCT TCGCTGCTTC 9660

AGCATCTACC CAAATATTGA ATTCTGCTGT AGGCGTCCAA TTTCCAAATG TACCACCACC 9720

CATCAAAGTA ATAGATTCAA TATGCTCAGC GATTCTTGGC TCACGAATCA ATGCCGTTGC 9780

TACATTCGTA AGAGGACCTG TCGCTACAAT TGTTACAGGT GTATCACTCG TCATCACTTT 9840

GTTTATAATC ACATCTGATG CTGGCATTGC AACTGCTTGA CGTGATGGTG TCGACGGTAG 9900

TTTCGGACCA TCTAATCCAG ATTCCCCATG TATTTCAGAA GCAAAGGCAG CTGGTTTAAT 9960

TAACGGCCTA TCCGCACCTT TCGCTACTGC TATATCTTGG CGTCCCATAA TATCCAATAC 10020

GTTCAAGGCG TTTGTCGTAT TCTTGTCAAC TGATTGATTA CCTGCGACTG TTGTTACAGC 10080

TAATATCTCT AGTGGACTGT CAATTGCCCC CGCTAAAATT AATGCTATTG CATCATCGTG 10140

TCCTCGATCA CAATCCATAA TAATCTTTCT TTTCATTTAT ATATCCACCT TTCTTAAGTT 10200

GTTATCGATA GCTTATGTAT ATTTATTTAT GTGGTGAATC ATGTTTATTT TGAAAAATAG 10260

TTTTAACTTT CTCATATTTT TGGATACAAA CACTATTTAT CTATTTTATG GCTTATAAAT 10320

TTATCCGATA TGCCTTATCA ACCTACCTCG CTAAAAATAG GATGTCTACA TATCTATACC 10380

GACTTTTGTC AACTCATTTT CACAACAATA TAAACAGCAA TTTATATGAT TGTTACATGA 10440

TTCAAACAAT TTTTATGAAA AATATTTTCA TACACAGAAT ATATATTGAT ATTAAATTTC 10500

TCAAAAGCTA TA7TGAGAAT AATTAGGAGG GATGTTGATG AAATCTTTAT TTGAAAAAGC 10560

ACAGCAGTTC GGCAAGTCCT TTATGTTACC TATCGCAATC TTACCAGCTG CAGGTCTATT 10620

GTTGGGTATC GGTGGTGCAT TAAGTAATCC AAACACCGTT AAAGCATACC CTATTTTAGA 10680
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AAATTT AC CG 


GTCATCTTTG 


CAATTGGTGT 


CGCAATCGGA 


TTATCTAGAA 


GCGATAAAGG 


10300 




TACTGCAGGT tTAGctGCGC TGCTCGGTTT CTTAATTATG AACGCAACTA TGAATGGCTT 10860

ATTAACTATC ACGGGCACAT TGGCAAAAGA TCAGCTTGCA CAAAATGGAC AAGGCATGGT 10920

GCTCGGTATA CAAACGGTTG AAACCGGTGT TTTTGGCGGG ATTATCACAG GTATTATGAC 10980

CGCAATACTT CACAACAAAT ATCACAAAGT GGTATTACCA CCGTATTTAG GTTTCTTTGG 11040

TGGCTCTAGA TTTGTCCCTA TTGTCACAGC ATTTGCCGCA ATCTTTTTAG GTGTATTGAT 11100

GTTTTTCATT TGGCCAAGCA TACAAGCCGG CATTTATCAT GTTGGTGGAT TTGTAACGAA 11160

AACAGGTGCC ATCGGTACTT TTGTTTATGG CTTCATCTTA AGATTGTTAG GTCCACTCGG 11220

TTTACACCAT ATTXT1TACT TACCGTTTTG GCAGACGGCA CTTGGTGGTA CTTTAGAAGT 11280

CAAAGGGCAC TTAGTTCAAG GTACGCAGAA CATCTTCTTT GCTCAACTTG GTGATCCAGA 11340

TGTGACGAAG TATTATTCAG GTGTGTCACG CTTTATGTCA GGCCGTTTTA TTACGATGAT 11400

GTTCGGCTTA TGTGGTGCCG CACTTGCAAT TTATCACACA GCTAAACCTG AACATAAAAA 11460

AGTTGTCGGC GGTTTAATGT TATCCGCTGC ACTCACTTCA TTTTTAACAG GTATTACCGA 11520

ACCTTTAGAG TTTAGTTTCT TGTTTGTCGC ACCTATTCTT TATGTAATCC ATGCCTTCTT 11580

TGATGGATTA GCATTTATGA TGGCAGACAT TTTCAACATT ACAATTGGTC AAACCTTCAG 11640

TGGAGGCTTT ATCGATTTCT TACTCTTTGG TGTGCTACAA GGTAATAGTA AAACAAACTA 11700

CCTATACGTC ATACCTATTG GAATTGTGTG GTTCTGTTTG TATTACATCG TTTTCAGATT 11760

CTTAATTACG AAATTTAATT TCAAAACACC TGGTCGAGAA GATAAAGCTG CAGCACAACA 11820

AGTTGAGGCT ACTGAAAGAG CACAAACTAT TGTTGCTGGT TTGGGAGGCA AAGATAACAT 11880

TGAAATCGTT GACTGTTGTG CAACGAGACT ACGCGTCACA CTTCATCAAA ATGACAAAGT 11940

CGATAAAGTA TTACTCGAAA GTACTGGTGC CAAAGGTGTA ATCCAGCAAG GCACTGGTGT 12000

GCAAGTAATT TATGGGCCTC ACGTTACAGT TATCAAAAAT GAAATTGAAG AATTGCTCGG 12060

GGATTAAGAC TAACCGAAAT ATCAACAGAA CTAATGGCAA CGATGTACGA AGTAAGAAGT 12120

GACATCGTTG CrriTATTTT TAATGTTACA TTTGAAGCAT TAAGTTCATC ATGCACTGTA 12180

GTGAGCCCGC AAATCGCCTC TGCTAGACAA TCATCTTAAT GCTATGATTA AAGCTTAAf^T 12240

GCCAGA1TTG AATTTAATTT CAACAACGAC TTTCACTACA TTAAAAATAG GGCCACTCGA 12300

CACATATAGT TGTATCAAAT AGCCCTTTAT ACAATTTTTT GGGTAAGGTT TTACAATTTT 12360

TGGGATGGTA TAGATTTTAT AAAAAGTTAT TTAAGTTCTT CTGCTTCAGC CATAATATCT 12420

TTTAATGTTT TAGCTGAATG TGCGAACTTG CTTTGTTCTT CGTCGTTTAA TGGGATTTCT 12480






TCCTCATATT CGCCTTCTAA TAATGCTGAT ACAGTCAATA CGGCATCTTC ATTTCTGAAA 12600

ATCGCTTCAG TAATTCTAGC TAATCCCATT GCAACACCA7 AATAAGTGGC ACCTTTAGCT 12660

TGAATAATGT CATATGCTGC ATCACGTGTT TGAACAAAAA TTTGTTCAAT TTGCGCTTTG 12720

CCCTCAGGAC GTTGTTCAAG TAATGTCTTC AAAGGTTGAC CCGCAATATT AGCGTGTGAC 12780

CATACTGGTA ATTCAGTGTC ACCATGTTCA CCAATAATTT GAGCATCGAC GCTACGTGGC 12840

GCAACATCGn AcgyTcGCTT AACAATAATC TAAAGCGTGC AGAGTCTAAA ATTGTACCAG 12900

AACCTATAAC ACGTTCTTTA GGTAAACCAG AGAATTTCCA TGTTGCATAC GCTAAAATAT 12960

CAACAGGATT TGTAGCTACC AAGAAAATAC CATCAAATTT TGATGCCATT ACTTCACCAA 13020

CAATTGATTT GAATATTTTC AAGTTTTTAG ATACTAAATC TAAACGTGTT TCTCCAGGTT 13080

TTTGTGCAGC ACCAGCACAG ATGACAACTA GATCCGCATC ATGACAATCA CTGTATTCGC 13140

CAGCTTTCAC ACGAACTGTT GTTGGAGAAT ATGGTGTGGC ATGTTTTAAA TCCATAACAT 13200

CTCCTCGAAC TTTTTCAGTG TCTAAATCAA TGATGACTAA TTCATCAACA ATGCTTTGGT 13260

TCACTAATGA AAATGCGTAG CTTGAACCTA CTGCACCATT ACCTATTAAT ACAACTTTGT 13320

TCCCTTTAAA TTTGTTCATT ACAAAAACTC CCTTATGATT AATTCACTAA CATACATGTA 13380

GCTTCAAATA TGTTAGTTTA ATGCTGCTTA TTGACGATAC AAAAGCAAAT AAACATCTCT 13440

TTTATTTTCA ACGCATAACT TAAAAGGTCA TGTGTCATCC GCTTTTAAGT TTGTGATTTA 13500

TTTCACATAT AAAATGTAAC ATGCATTAAG TACTGGGTCA ATATTAAATT GTGATTTATT 13560

TCACATTTTA TTTTAATTTT TACACCTTTT TAATTTGTAT mCGATTACAT CTTAGATGTC 13620

TTTAGTCTTC GTACTTCGCC AGTGATTATT TACACTTTCA CATTTTTATT ATCATGTTTA 13680

CTTTTTTCTA GGAAAACAAC AATGTTTTTT GAATTAGTCA AATAAATGCG CTCAATCGTC 13740

GGTGTGCAAA CAGACAATTG TACACAATGC TTATTGATAA GTATTTAAAA AATTAAAAAT 13800

GTCATACAAT TATCAAATTT GCCATTTTAT TTATATTTTC TCAAACCAAT TAATTGAATA 13860

TCGAAATTTT TAGTAGAATA ATCAAAATAT ACAGATTAAA GGAGGAGTAT CATGCTTACA 13920

GAACAAGAGA AAGACATTAT CAAACAAACG GTGCCTTTAC TTAAAGAGAA AGGGACAGAA 13980

ATTACGTCAA TCTTTTATCC AAAAATGTTT AAAGCGCATC CTGAACTTTT AAACATGTTT 14040

AATCAAACGA ACCAAAAACG AGGCATGCAA TCTTCAGCAT TAGCACAAGC TGTAATGGCC 14100

GCAGCGGTTA ATATCGATAA CTTAAGTGTT ATTAAACCAG TCATTATGCC AGTCGCATAT 14160

AAACACTGCG CACTACAAGT TTATGCTGAA CATTATCCAA TTGTGGGGAA AAATTTATTA 14220

AAAGCCATTC AAGACGTGAC AGGATTAGAA GAAAATGACC CTGTCATTCA AGCTTGGGCA 14280
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(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6779 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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<xi) 


SEQUENCE DESCRIPTION: SEQ ID NO: 58:






GGTATTTTnG GAnGGGTACC TAAAGCAATT CCGGCAAAGG GTnAATCCAG GTACCGAAAT 60

GGACTTCCCG TTATCGATAA TACCGACATA TATTGTGACA AGTAGATTTT ATf5GACATTT 120

AGGCTTACTT TTACTTGTGA TAATTGCATG TATGTTTACT GGTATTTAt-C faTfaATAPA 180

TATCATTCAA TTATTGATAT ATGTACCGTT TTGTTTTTTC TTAACTGCCt PGGTGAPflTT 240

ATTAACATCA ACACTCGGTG TGTTAGTTAG AGATACACAA AAfZPAATZITT 300

AAGAATATTA TTTTACTTTT CACCAATTTT GTGGCTACCA AAGAACCATCi ulnlLnu1o\j 360

TTTAATTCAT GAAATGATGA AATATAATCC AGTTTACTTT ATTGCTGAAT PATAcmTOP 420

AGCAATTTTA TATCACGAAT GGTATTTCAT GGATCATTGG AAATTAATGT TATAPAATTT 480


CGGTATTGTT GCCATTTTCT TTGCAATTGG TGCGTACTTA CACATGAAAT ATAGAGATCA 540

ATTTGCAGAC TTCTTGTAAT ATATTTATAT GACGAAACCC CGCTAACCAT TAATAAATGG 600

AAGTGGGGTT CATTTTTGTT TATAATTTAA GTAAATAACA TATTAAGTTG GTGTATTATG 660

AACGTTTTAA TAAAGAAATT TTATCATTTG GTAGTTCGAA TACTTTCTAA AATGATTACG 720

CCTCAAGTGA TTGATAAACC GCATATCGTA TTTATGATGA CTTTTCCAGA AGATATTAAG 780

CCTATCATCA AAGCATTAAA TAATTCGTCG TATCAGAAAA CTGTTTTAAC AACACCAAAA 840

CAAGCGCCTT ATTTATCTGA ACTTAGCGAC GATGTTGATG TGATAGAAAT GACTAATCGA 900

ACATTGGTAA AACAAATTAA GGCTTTGAAA AGCGCGCAGA TGATTATTAT CGATAATTAT 960

TACCTATTGC TAGGTGGATA TAATAAGACT TCTAATCAAC ACATTGTTCA AACGTGGCAT 1020

GCAAGTGGTG CATTAAAAAA CTTTGGCTTA ACAGATCATC AAGTCGATGT GTCTGACAAG 1080

GCAATGGTTC AGCAGTACCG TAAAGTTTAT CAAGCGACGG ArrriTACTT AGTGGGTTGT 1140

GAACAAATGT CACAATGTTT TAAACAGTCT TTAGGTGCAA CAGAAGAGCA AATGCTGTAT 1200

TTTGGGCTTC CGAGAATTAA TAAATATTAC ACAGCTGATA GAGCAACGGT TAAGGCAGAG 1260

TTAAAGGATA AATATGGAAT TACAAATAAG TTGGTATTAT ATGTACCAAC ATATAGAGAA 1320

GATAAAGCAG ATAATAGGGC TATTGATAAA GCTTATTTTG AAAAATGTTT ACCAGGATAT 1380
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ATCGACACGT CTACATTAAT GCTAATGTCA 
GATATAATTA TTAGCGACTA TAGTTCGCTG 1500

CCAATAGAAG CTAGCTTGTT AGATATTCCA ACTATATTTT ATGTGTATGA TGAAGGAACA 1560

TATGATCAGG TGAGAGGCCT GAATCAATTT TACAAAGCAA TACCGGATAG CTACAAAGTG 1620

TATACTGAAG AAGATTTAAT AATGACGATA CAAGAAAAAG AACATCTATT AAGTCCGTTA 1680

TTTAAAGATT GGCATAAGTA TAATACTGAT AAAAGTTTAC ATCAGCTCAC AGAATATATA 1740

GATAAGATGG TGACAAAATG AGGTTTACGA TAATCATACC TACATGTAAT AATGAGGCAA 1800

CAATTCGACA ATTGTTAATA TCTATTGAGA GTAAAGAACA CTATAGAATC CTTTGTATTG 1860

ATGGTGGTTC TACTGATCAA ACAATTCCTA TGATTGAACG GTTACAAAGA GAACTCAAGC 1920

ATATTTCATT AATACAATTA CAAAATGCTT CGATAGCTAC GTGTATTAAT AAAGGTTTGA 1980

TGGATATCAA AATGACAGAT CCACATGATA GTGACGCATT TATGGTCA7A AAACCAACAT 2040

CAATCGTATT GCCAGGTAAA TTAGATAGGT TAACTGCTGC TTTCAAAAAT AATGATAATA 2100

TTGATATGGT AATAGGGCAG CGAGCTTACA ATTACCATGG TGAATGGAAA TTGAAAAGTG 2160

CTGATGAGTT TATTAAAGAC AATCGAATCG TTACATTAAC GGAACAACCA GATTTGTTAT 2220

CAATGATGTC TTTTGACGGA AAGTTATTCA GTGCTAAATT TGCTGAATTA CAGTGTGaCG 2280

AAACTTTAGC TAACaCATAC AATCACGCAA TACTTGTCAA GGCGATGCAA AAAGCTACGG 2340

ATATACATTT AGTTTCACAG ATGATTGTCG GAGATAACGA TATAGATACA CATGCTACAA 2400

GTAACGATGA AGATTTTAAT AGATATATCA CAGAAATTAT GAAAATAAGA CAACGAGTCA 2460

TGGAAATGTT ACTATTACCT GAACAAAGGC TATTATATAG TGATATGGTT GATCGTATTT 2520

TATTCAATAA TTCATTAAAA TATTATATGA ACGAACACCC AGCAGTAACG CACACGACAA 2580

TTCAACTCGT AAAAGACTAT ATTATGTCTA TGCAGCATTC TGATTATGTA TCGCAAAACA 2640

TGTTTGACAT TATAAATACA GTTGAATTTA TTGGTGAGAA TTGGGATAGA GAAATATACG 2700

AATTGTGGCG ACAAACATTA ATTCAAGTGG GCATTAATAG GCCGACTTAT AAAAAATTCT 2760

TGATACAACT TAAAGGGAGA AAGTTTGCAC ATCGAACAAA ATCAATGTTA AAACGATAAC 2820

GTGTACATTG ATGACCATAA ACTGCAATCC TATGATGTGA CAATATGAGG AGGATAACTT 2880

AATGAAACGT GTAATAACAT ATGGCACATA TGACTTACTT CACTATGGTC ATATCGAATT 2940

GCTTCGTCGT GCAAGAGAGA TGGGCGATTA TTTAATAGTA GCATTATCAA CAGATGAATT 3000

TAATCAAATT AAACATAAAA AATCTTATTA TGATTATGAA CAACGAAAAA TGATGCTTGa 3060

ATCAATACGC TATGTCGATT TAGTCATTCC AGAAAAGGGC TGGGGACAAA AAGAAGACGA 3120

TGTCGAAAAA TTTGATGTAG ATGTTTTTGT TATGGGACAT GACTGGGAAG GTGAATTCGA 3180
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w 



TAAAATCAAA CAAGAATTAT ATGGTAAAGA TGCTAAATAA ATTATATAGA ACTATCGATA 3300

CTAAACGATA AATTAACTTA GGTTATTATA AAATAAATAT AAAACGGACA AGTTTCGCAG 3360

CTTTATAATG TGCAACTTGT CCGTTTTTAG TATGTTTTAT TTTCTTTTTC TAAATAAACG 3420

ATTGATTATC ATATGAACAA TAAGTGCTAA TCCAGCGACA AGGCATGTAC CACCAATGAT 3480

AGTGAATAAT GGATGTTCTT CCCACATACT TTTAGCAACA GTATTTGCCT TTTGAATAAT 3540

TGGCTGATGA ACTTCTACAG TTGGAGGTCC ATAATCTTTA TTAATAAATT CTCTTGGATA 3600

GTCCGCGTGT ACTTTACCAT CTTCGACTAC AAGTTTATAA TCTTTTTTAC TAAAATCACT 3660

TGGTAAAACA TCGTAAAGAT CATTTTCAAC ATAATATTTC TTACCATTTA TCCTTTGCTC 3720

ACCTTTAGAC AATATTTTTA CATATTTATA CTGATCAAAT GAGCGTTCCA TTAATGCATT 3780

CCCCATCATA TTACGTTGCT TCTCGCCACC AAGGTTTTTA TAGTCTCCTG CACCCATGAT 3840

AACTTGATTA ATTCTAAATT TACCTCGTTT GGTAGTAATC GTATGGTTGT AATTTGCTGT 3900

ATCACTTGAT CCAGTTTTTA AACCATCTGT ACCCGGCAAA CTCATTTTTG CACCTTCCAA 3960

TGAAAAGTTG AATGTGTAAT ACGTAACTGC ATGCGTTGTT GGTGCTAACT GCTTTGTAAA 4020

GTCTAATATT TTAGGTGTCT CTTTAATCAC GTGTAAATCT AAAATGGCAT AGTCTCTAGC 4080

AGTCGTTACA GTACGTTCTT GGTCTTTATA CTTTGTTGGT GCAAATGTAC GTAATCTTGA 4140

ATTTTCAGCA CCCGTTGGAT TGACGAAATG TGTATTTTTC ATTCCGATAG CTTTAGCTTT 4200

GTTATTCATT AAATCAACGA AATCGCTGGT GTTTTTTGAA ACCTTCTTAG CTAAAATTAA 4260

TGCCGCGGCA TTACTAGAAT TAGATACTGT AATTTGTAAT AGGTCTGCGA TTGTCCATAC 4320

TTGTCCAGGA TATAGTTTCG TATTACTCAA CTCAGGTAGT GTAGACATAA TATATTCTTT 4380

GTTCGTCATT GTGACTGTGT CATCAAGTGA AAGCTGCCCC TTATTTACAG CTTCCAATGT 4440

TAAGTACATT GTCATTAATT TAGTCATAGA CGCTGGAtTC CACTTAGTAT CGATATTGTA 4500

TTGATACAGT AATTGTCCAG TTTGACTTAC ATTAACAGCA CTCGTCGGTT CGTATGCAGC 4560

CGACAAACCT GCATAACCAT ATTGATTTGC TGCTTGTACA GGGGTTACGT CACTGTTAGT 4620

AGCTTGTGCA TATGGTGTCA TAATACTTAA TGTTAAACAT AAAATGATGA TAATAGATAT 4680

TAAATTTTTC ATAAAGCGTT AATCTTCCCT TTTCCAATTC TTAAATATTC CCTAAAAGCA 4740

ATGGTTATTC CTACTTACGG AAATCATTGC TAATTCACTT CACCTTAATT AAATTGTTGA 4800

AAATAAAGTT TTCTGCAGTT AATTTGAAAA ATAATGCAAA TATATTACGT GTGTAGCTAA 4860

AGGTGTTATA ATGTTTGTAC GAAGAGCAAA CTTACTCAAA AGCGATTAAT TTTCATGTTT 4920

TAATATAAAG ACTTTGAGAA GTTATTACAA AAAATGCAAT AGAAATATTC TATCATATAA 4980
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20 



AAGTATATGA TAGAAATGCA TGTATCTATC TAAATGAATT AACTATAAAT TTCAAACAGA 5100 

AGAGGTAAAA CTATGAAACG AGAAAATCCA TTGTTTTTCT TATTTAAAAA ACTATCATGG 5160

CCAGTGGGTC TTATCGTTGC AGCTATCACT ATTTCATCAC TAGGGAGCTT AAGTGGACTA 5220

TTAGTGCCAC TGTTTACTGG ACGAATTGTA GATAAATTTT CCgTGAGCCA TATCAATTGG 5280

AATCtAATCS CATTATTTGG TGGTATCTTT GTCATCAATG CTTTATTAAG CGGATTAGGT 5340

TTATATTTAT TAAGTAAAA7 7GGTGAAAAG ATTATTTATG CGATACGCTC AGTTTTATGG 5400

GAGCATATCA TACAATTAAA AATGCCATTC TTTGACAAAA ATGAAAGTGG TCAATTAATG 5460

AGTCGATTAA CTGACGATAC GAAAGTGATA AATGAATTTA TTTCACAAAA GCTACCTmAC 5520

TTATTACCAT CAATCGTTAC ATtAGTTGGG TCACTAATCA TGTTATTTAT TTTAGATTGG 5580

AAAATGACAT TATTAACATT TATAACGATA CCGATATTCG TTTTaATTAT GATTCCTCTA 5640

GGTCGTATTA TGCAAAAGAT ATCGACAAGT ACACAATCTG AAATTGCAAA CTTCAGTGGT 5700

TTGTTAGGGC GTGTCCTAAC TGAAATGCGT CTTGTTAAAA TATCAAATAC AGAGCGTCTT 5760

GAATTAGATA ATGCACATAA AAATTTGAAT GAAATATATA AATTAGGTTT AAAACAGGCT 5820 

AAAATTGCGG CAGTTGTACA ACCAATTTCA GGTATAGTTA TGTTGCTAAC AATTGCAATT 5880

ATTTTAGGTT TTGGTGCATT AGAAATTGCG ACTGGTGCAA TCACTGCAGG TACATTAATT 5940

GCAATGATAT TTTATGTTAT TCAGTTATCT ATGCCTTTAA TCAATCTTTC CACGTTAGTT 6000 

ACAGATTATA AAAAGGCAGT CGGTGCAAGT AGTAGAATAT ACGAAATCAT GCAAGAACCT 6060

ATTGAACCGA CAGAAGCTCT TGAAGATTCT GAAAATGTAT TAATTGATGA CGGTGTATTG 6120 

TCATTTGAAC ATGTAGACTT TAAATATGAT GTGAAGAAAA TATTAGATGA TGTGTCGTTC 6180

CAAATCCCAC AAGGTCAAGT GAGTGCTTTT GTAGGCCCTT CTGGGTCTGG TAAAAGTACG 6240

ATATTTAATC TGATAGAACG TATGTATGAA ATTGAGTCAG GTGATATTAA ATATGGCCTT 6300

GAAAGTGTCT ATGATATCCC GTTATCTAAG TGGCGACGCA AAATTGGATA TGTTATGCAA 6360

TCAAATTCGA TGATGAGTGG TACAATTAGA GACAATATTT TATACGGAAT TAATCGTCAT 6420

GTTTCAGATG AAGAACTTAT TAATTATGCT AAATTAGCGA ACTGTCATGA TTTTATCATG 6480

CAATTTGATG AAGGATATGA CACGCTTGTA GGTGAACGAG GATTGAAACT GTCTGGCGGA 6540

CAACGTCAAC GTATTGATAT TGCTAGAAGT TTTGTTAAAA ATCCTGATAT TTTGTTACTT 6600 

GATGAAGCAA CAGCTAATCT CGATAGTGAA AGTGAATTGA AAATTCAAGA AGCTTTAGAA 6660 

ACATTGATGG AAGGTAGAAC AACGATTGTC ATTGCGCATC GTTTGTCTAC AATTAAAAAA 6720

GCCGGTCAAA TTATATTCTT AGACAAAGGA CAGGTAACAG GTAAAGGTAC GCA7TCAGAA 6780
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TTTTATATAT ATAAGTAAGC TTGGAGCAAA TACACATATA CCATCGAGGA AATTAAAGTG 6 90 0 

TGGCACATTG ATGGATATAG ATGTTAATAA ATTGCTTCAA GCTTTTGTCT ATTTTAAATC 6960

ATTTGAGAAG TTACGACATA ATAATTCTTA AATTAATGAA ATCGATATTT TAAGAAAAAA 7020

ATGCTCATGG TATAATACAA GTTATAAGCA AACATACATA TATTAAATAC TGTAGCCACG 7080

AGTCATAATT CTTHATATTT TACATAGCAA TTTAACTGAT TTTAGAGTCC ACGGTACAGA 7140

AGTTTGATAT TTCAATGTTT CTAAATTTTT AAAAAATTAA ATCATAGGTG GGTGCCAAAT 7200

GTTTTTATTA ATCAACATTA TTGGTCTAAT TGTATTTCTT GGTATTGCGG TATTATTTTC 7260

AAGAGATCGC AAAAATATCC AATGGCAATC AATTGGGATC TTAGTTGTTT TAAACCTGTT 7320

TTTAGCATGG TTCTTTATTT ATTTTGATTG GGGTCAAAAA GCAGTAAGAG GAGCAGCCAA 7380

TGGTATCGCT TGGGTAGTTC AGTCAGCGCA TGCTGGTACA GGTTTTGCAT TTGCAAGTTT 7440 

GACAAATGTT AAAATGATGG ATATGGCTGT TGCAGCCTTA TTCCCAATAT TATTAATAGT 7500 

GCCATTATTT GATATCTTAA TGTACTTTAA TATTTTACCG AAAATTATTG GAGGTATTGG 7560

TTGGTTACTA GCTAAAGTAA CAAGACAACC TAAATTCGAG TCATTCTTTG GGATAGAAAT 7620

GATGTTCTTA GGAAATACTG AAGCATTAGC CGTATCAAGT GAGCAACTAA AACGTATGAA 7680

TGAAATGCGT GTATTAACAA TCGCAATGAT GTCAATGAGC TCTGTATCGG GAGCTATTGT 7740

AGGTGCGTAT GTACAAATGG TACCAGGAGA ACTGGTACTA ACGGCAATTC CACTAAATAT 7800

CGTTAACGCG ATTATTGTGT CATGCTTGTT GAATCCAGTA AGTGTTGAAG AGAAAGAAGA 7860

TATTATTTAC AGTCTTAAAA ACAATGAAGT TGAACGTCAA CCATTCTTCT CATTCCTTGG 7920

AGATTCTGTA TTAGCAGCAG GTAAATTAGT ATTAATCATC ATCGCATTTG TTATTAGTTT 7980

TGTAGCGTTA GCTGATCTAT TTGATCGTTT TATCAATTTG ATTACAGGAT TGATAGCAGG 8040

ATG<3ATAGGC ATAAAAGGTA GTTTCGGTTT AAACCAAATT TTAGGTGTGT TTATGTATCC 8100 

ATTTGCGCTA TTACTCGGTT TACCTTATGA TGAAGCGTGG TTGGTAGCAC AACAAATGGC 8160

TAAGAAAATT GTTACAAATG AATTTGTTGT TATGGGTGAA ATTTCTAAAG ATATTGCATC 8220

TTATACACCA CACCATCGTG CGGTTATTAC AACATTCTTA ATTTCATTTG CAAACTTCTC 8280

AACGATTGGT ATGATTATCG GTACATTGAA AGGCATTGTT GATAAAAAGA CATCAGACTT 8340

TGTATCTAAA TATGTACCTA TGATGCTATT ATCAGGTATC CTAGTTTCAT TATTAACAGC 8400

AGCTTTCGTT GGTTTATTTG CATGGTAATA TGTCGAAGAG TGACTATGAT AATACATTTT 8460

AACTAATAAA TATGTCCAGG CATGTCGTCT ATTGATATAG GTGAGATGCT TGGACTTTTT 8520

TATTATTGAT ATAAAGGTAT nTAAATATTT TTAAAGTTAC CGAAATTGAA GCATTATAAA 8580
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GACAGTAAGG ACTAGGTACA GTCATAGTAC TTCGAGCAAA ATTTGTTTTG TTATTATAAA 8 7 00 

CAACACAAAG GAGATAACTT CTCTAnTGAA GAAGTTAAAA ACATTATAGC AGACAATGAA 8760

ATGAAAGTAA ATTAAAAAT 8779

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31096 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

GTTGCAGTAG TCAAAGAATT AAACAAGGTG AAGGcGTGTA GCTTGCACAC CCGAAAATGT 60

GCGTAAGTTA aCGGATGCAG GACATAAAGT AATTGTTGAA AAAAATGCTG GCATTGGTTC 120

AGGATTTTCT AACGATATGT ATGAAAAAGA AGGCGCTAAG ATCGTAACTC ACGAACAAGC 180 

ATGGGAAGCT GATCTTGTTA TCAAAGTAAA AGAACCTCAT GAAAGCGAAT ATCAATATTT 240

CAAAAAGAAT CAAATTATCT GGGGATTTTT ACATCTAGCA TCTTCAAAAG AAATAGTAGA 300

AAAAATGCAA GAAGTTGGTG TAACTGCGAT TAGTGGTGAA ACCATTATAA AAAATGGAAA 360

AGCAGAATTA TTAGCGCCAA TGAGTGCTAT AGCAGGTCAA CGCTCAGCAA TTATGGGAGC 420

TTACTACTCT GAAGCACAAC ATGGTGGTCA AGGTACTTTA GTGACTGGTG TACATGAAAA 480

TGTGGATATA CCTGGTAGTA CATATGTGAT TTTCGGTGGT GGAGTAGCAG CAACAAATGC 540

AGCAAATGTT GCCTTGGGAC TAAATGCTAA AGTAATCATT ATCGAGTTAA ACGATGACCG 600

CATTAAATAT CTTGAAGATA TGTATGCAGA AAAAGATGTC ACAGTAGTCA AATCAACACC 660

AGA^AATTTA GCAGAACAAA TTAAGAAAGC AGATGTATTT ATTTCTACAA TTTTAATTTC 720 

AGGTGCGAAA CCGCCAAAAT TGGTTACTCG TGAGATGGTT AAATCAATGA AAAAAGGTTC 780

AGTATTAATC GATATAGCTA TTGACCAAGG TGGAACTATT GAAACAATTA GACCAACTAC 840

AATTTCTGAT CCAGTGTATG AAGAAGAAGG TGTGATTCAT TATGGTGTAC CAAATCAACC 900 

AGGAGCAGTC CCAAGAACTT CAACAATGGC ATTAGCACAA GGAAATATTG ATTATATATT 960

AGAAATTTGT GACAAAGGCT TAGAACAAGC AATTAAAGAT AATGAAGCCT TAAGTACTGG 1020 

TGTAAACATT TACCAAGGAC AAGTGACAAA TCAAGGATTA GCTTCATCAC ATGACCTAGA 1080 

TTATAAAGAA ATATTAAATG TTATCGAATA GATAGTAATT TAAATGAAAT TGAGTGAAAT 1140

GAATATTTTA AATATAGCAT TATAGTTTGG ACTAAAAATT TACAAAACGG AAGGATGTAA 1200
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TCGAAGAAGC TAAAGCAAGC ATTAAACCAT TTATTCGTCG AACACCTCTA ATTAAATCAA 1320 

TGTATTTAAG CCAAAGTATA ACTAAAG3GA ATGTATTTCT AAAATTAGAA AATATGCAAT 1380

TCACAGGATC TTTTAAATTT AGAGGCGCTA gCAATnAAAA TTAATCACTT AACAGATGAA 1440

CAAAAAGAAA AAGGCATTAT CGCAGCATCT GCTGGGgAAC CATGCACAAG GTGTTGCTTT 1500 

AACAGCTAAA TTATTAGGCA TTGATGCAAC GATTGTAA7G CCTGAAACAG CACCACAAGC 1560

GAAACAACAA GCAACAAAAG GCTATGGGGC AAAGGTTATT TTAAAAGGTA AAAACTTTAA 1620

CGAAACTAGA CTTTATATGG AAGAATTAGC GAAAGAAAAT GGCATGACAA TCGTTCATCC 1680

ATATGACGAT AAGTTTGTAA TGGCAGGCCA AGGAACAATT GGTTTAGAAA TTTTAGATGA 1740

TATTTGGAAT GTGAATACAG TCATCGTACC AGTTGGCGGT GGAGGATTAA TTGCAGGTAT 1800

TGCCACCGCA TTAAAATCAT TTAACCCTTC AATTCATATT ATCGGTGTTC AATCTGAGAA 1860 

TGTTCATGGT ATGGCTGAGT CTTTCTATAA GAGAGATTTA ACTGAACATC GAGTGGATAG 1920

CACAATAGCA GATGGTTGTG ATGTAAAAGT TCCTGGTGAA CAAACATATG AAGTAGTTAA 1980

ACATTTAGTA GATGAATTTA TTCTTGTTAC TGAAGAAGAA ATTGAACATG CTATGAAAGA 2040

TTTAATGCAG CGTGCCAAAA TTATTACTGA AGGTGCAGGC GCATTACCAA CAGCTGCAAT 2100

TTTAAGTGGA AAAATAAACA ATAAATGGCT TGAAGATAAA AATGTTGTTG CATTAGTTTC 2160 

AGGCGGGAAT GTTGACTTAA CTAGAGTTTC AGGTGTCATT GAACATGGAC TGAATATTGC 2220

AGATACAAGC AAGGGTGTGG TAGGTTAAAA CATTTAATCT TAAAAATGAG GTGTAATTAT 2280

GTCAAATGGT AAAGAATTAC AAAAAAATAT AGGTTTCTTC TCAGCGTTTG CTATTGTTAT 2340

GGGGACAGTT ATTGGTTCAG GAGTATTCTT TAAAATATCA AACGTAACAG AAGTAACAGG 2400

AACAGCAGGA ATGGCCTTGT TTGTATGGTT CCTAGGCGGC ATCATTACCA TTTGTGCGGG 2460

GTTAACAGCA GCAGAACTTG CTGCTGCAAT CCCTGAAACA GGTGGCTTAA CGAAGTATAT 2520

AGAATATACA TACGGTGATT TCTGGGGCTT CCTATCAGGT TGGGCGCAAT CATTTATTTA 2580

TTTTCCAGCT AACGTAGCAG CATTGTCTAT CGTATTTGCG ACACAGCTAA TTAATTTATT 2640

CCATTTATCT ATAGGTTCGT TAATACCAAT AGCAATCGCA TCTGCGTTAT CTATTGTGTT 2700 

GATAAATTTC CTAGGTTCAA AAGCAGGCGG AATTTTACAA TCAGTTACTT TAGTAATTAA 2760 


ACTGATTCCA ATCATCGTTA TTGTAATTTT TGGTATTTTT CAATCTGGAG ATATCACTTT 2820 

TTCATTAATT CCAACTACAG GTAATTCaGG AAATGGCTTC TTTACAGCAA TTGGTAGTGG 2880

TTTATTAGCA ACTATGTTTG CATATGATGG TTGGATTCAT GTAGGAAATG TTGCGGGGGA 2940

ACTTAAAAAT CCTAAACGCG ATTTACCTTT AGCGATTTCA GTTGGTATCG GTTGTATTAT 3000
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7GGTAATTTA 


AATGCAGCTT 


CAGATACATC 


AAAAATATTA 


TTTGGTGAAA 


ATGGCGGTAA 


3120 




GATTATTACA ATCGGTATAT TAATTTCTGT TTATGGTACG ATCAATGGCT ATACTATGAC 3180

TGGTATGCGC GTACCATATG CAATGGCTGA AAGAAAATTA TTGCCATTTA GCCATTTATT 3240

CGCAAAATTA ACAAAATCTG GCGCACCATG GTTTGGCGCA ATTATACAAC TTATAATCGC 3300

TATCATCATG ATGTCAATGG GAGCATTTGA TACAATTACA AATATGTTAA TCTTTGTTAT 3360

TTGGTTGTTC TATTGTATGT CATTTGTTGC GGTAATAATT TTAAGAAAAC GTGAACCAAA 3420

TATGGAACGA CCATATAAAG TACCGTTATA TCCGATCATA CCTTTAATTG CTATTTTGGC 3480

AGGATCATTT GTATTAATTA ATACACTGTT TACACAATTT ATATTAGCAA TCATTGGAAT 3540

TCTAATAACA GCACTTGGTA TACCAGTTTA TTACTATAAA AAGAAACAAA AAGCAGCATA 3600

AGGTAAGATA ACTAGCATTG AGAATAAATG GATGGACTAC TAATAAATTT AAAGTTTTAC 3660

ACATTAAAAT CAAAAACCAT TCAATTATTC TATGGAACAG ACAAATTTCT GTTATGGAAT 3720

TTGTCTGTTT TTCAAAAGTA TAGGGAGGCA AATAGAGATG GAAAAGCCGT CAAGAGAGGC 3780

ATTTGAAGGC AATAATAAGT TGTTAATAGG AATTGTTCTA AGTGTAATAA CGTTTTGGCT 3840

ATTTGCACAA TCATTGGTTA ATGTTGTACC AATACTTGAA GATAGTTTCA ATACAGATAT 3900

TGGAACGGTT AATATCGCCG TTAGTATAAC TGCTTTATTT TCAGGAATGT TTGTAGTAGG 3960

AGCAGGTGGT CTTGCTGATA AATATGGCAG AATTAAACTC ACGAACATTG GTATTATCTT 4020

AAATATATTA GGTTCATTA7 TAATCATTAT TTCAAATATT CCTTTATTAC TTATTATAGG 4080

AAGATTAATT CAAGGACTTT CAGCAGCATG TATTATGCCT GCAACTTTGT CTATTATTAA 4140

GTCATATTAC ATTGGGAAAG ATAGACAACG CGCTTTAAGT TATTGGTCAA TTGGCTCATG 4200

GGGCGGCTCT GGTGTTTGTT CATTTTTTGG AGGTGCAGTT GCAACGCTTT TAGGTTGGCG 4260

TTGGATTTTC ATCCTATCAA TTATAATTTC ATTAATTGCA CTGTTTCTTA TTAAAGGCAC 4320

ACCTGAAACT AAATCTAAAT CGATTTCTCT AAATAAATTT GACATTAAAG GTCTGGTTCT 4380

TTTAGTCATT ATGCTCCTCA GTTTAAATAT TTTAATTACT AAAGGATCAG AATTAGGTGT 4440

AACCTCACTT CTTTTTATTA CTTTATTAGC TATTGCAATT GGATCTTTTA GTTTATTTAT 4500

AGTTCTTGAA AAGCGTGCTA CAAATCCTTT AATCGATTTT AAATTATTTA AAAATAAAGC 4560

7TACACAGGT GCAACAGCTT CAAACTTTTT GTTAAATGGT GTTGCAGGAA CATTAATAGT 4620

AGCCAACACA TTTGTTCAAA GAGGTTTAGG ATATTCTTCA TTGCAAGCAG GAAGTTTATC 4680

AATCACTTAT TTAGTAATGG TACTAATTAT GATTCGTGTT GGTGAAAAGT TACTTCAAAC 4740

ACTCGGATGC AAGAAACCAA TGTTAATTGG AACAGGAGTT CTTATTGTCG GAGAATGTCT 4800
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ATTCTTTGG? TTAGGACTAG GGATATATGC TACACCATCA ACAGATACAG CAATTGCAAA 4 92 0 

TGCACCGTTA GAAAAAGTAG GCGTTGCTGC AGGTATCTAT AAAATGGCTT CTGCATTAGG 4980

TGGAGCATTT GGCGTCGCAT TGAGTGGTGC AGTATATGCA ATCGTATCAA ATATGaCAAA 5040

CA7TTATACA GGTGcAATGa TTGnCATTAT GGTTaAATGC AGGTATGGGa ATATTATCaT 5100 

TCGTTATCAT TTTGtTACTT GTGcCTAAAC mAAACGACAC TCAATTATGA TAATTGAGAA 5160

TTAAATTGAA ATCATACAAG TCGCTACAAT ATTAAACAAA AATATAAACC GATTCTTATG 5220

TGTCATTATT TTAAATGAAC ATAGGGATTG GTTTTTTATT ACTCTTTTAC GCTACTTTAT 5280

TTATAATTAT TATAAATTGT CACAAATTCA ATTTACCTTA CAATATATTT TGTGTTATTA 5340

TATTCTGGAG CATAAATAAA TTGTTCAACA CATAGTTGTA ATGTGTTTCA ATACTTTTTG 5400

GATAGATTGC GAAATTGTAT TGAATCGTCA TCGTTTTAAA TTTTTAAATG AGAATGGAAT 5460

GAGCATTACA ATACACAAGC AATCAAAAGT AAATACATTC ACAACACAAC AGAGACATAA 5520

CAACAAGATA AGGAGTGAAC AATAGCTGTG AATTATCGTG ATAAAATTCA AAAGTTTAGT 5580 

ATTCGTAAAT ATACAGTTGG TACATTTTCA ACTGTCATTG CGACATTGGT ATTTTTAGGA 5640

TTCAATACAT CACAAGCACA TGCTGCTGAA ACAAATCAAC CAGCAAGCGT GGTTAAACAG 5700

AAACAACAAA GTAATAATGA ACAGACTGAG AATCGAGAAT CTCAAGTACA AAATTCTCAA 5760

AATTCACAAA ATGGTCAATC ATTATCTGCT ACTCATGAAA ATGAGCAACC AAATATTAGT 5820

CAAGCTAATT TAGTAGATCA AAAAGTAGCG CAATCATCTA CTACTAATGA TGAACAACCA 5880

GCATCTCAAA ATGTAAATAC AAAGAAAGAT TCGGCAACGG CTGCGACAAC ACAACCAGAT 5940

AAAGAACAAA GTAAGCATAA ACAAAACGAA AGTCAATCTG CTAATAAAAA TGGAAACGAC 6000

AATAGAGCGG CTCATGTAGA AAATCATGAA GCAAATGTAG TAACAGCTTC AGATTCATCT 6060

GATAATGGTA ACGTACAACA TGACCGAAAT GAATTACAAG CGTTTTTTGA TGCAAATTAT 6120 

CATGATTATC GCTTTATTGA CCGTGAAAAT GCAGATTCTG GCACATTTAA CTATGTAAAA 6180 

GGCATTTTTG ATAAGATTAA TACGTTATTA GGCAGTAATG ATCCAATAAA CAATAAAGAC 6240

TTGCAACTTG CATACAAAGA ATTGGAACAA GCTGTTGCTT TAATTCGTAC AATGCCTCAA 6300

CGTCAACAGA CTAGCCGACG TTCAAATAGA ATTCAAACGC GTTCGGTTGA GTCAAGAGCT 6360

GCAGAGCCTA GATCAGTATC AGACTATCAA AATGCAAATT CATCATATTA TGTTGAAAAT 6420

GCTAATGATG GTTCGGGCTA TCCTGTTGGT ACATATATCa ATGCTTCTAG TAAAGGGGCG 6480

CCATATAATT TACCAACTAC ACCATGGAAT ACATTGAAGG CCTCTGACTC AAAGGAAATT 6540

GCTCTTATGA CAGCGAAACA AACTGGAGAC GGGTACCAAT GGGTTATTAA GTTTAATAAA 6600
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GTAGGAAGAA CTGACTTTGT AACAGTTAAT TCAGATGGAA CAAATGTACA ATGGAGTCAT 6 72 0 

GGAGCAGGAG CAGGTGCAAA TAAACCACTT CAACAAATGT GGGAATATGG AGTAAATGAT 6780

CCTCATCGTT CACATGACTT TAAAATAAGA AATAGAAGTG GCCAAGTAAT ATATGACTGG 6840

CCAACTGTCC ATATTTATTC TTTAGAAGAT TTATCTAGAG CGAGTGATTA TTTTAGTGAA 6900

GCTGGAGCGA CACCTGCTAC TAAAGCTTTT GGTAGACAAA ATTTTGAATA TATTAATGGT 6960

CAAAAACCTG CTGAATCACC GGGTGTTCCT AAAGTTTATA CTTTCATCGG TCAAGGTGAT 7020

GCAAGTTATA CAATTTCATT TAAAACACAA GGTCCAACTG TTAATAAATT GTACTATGCA 7080

GCAGGTGGGC GTGCTTTAGA GTACAATCAA TTATTTATGT ACAGTCAACT ATACGTCGAA 7140

TCAACGCAAG ACCATCAACA ACGTCTTAAT GGTTTAAGAC AAGTGGTTAA TCGTACATAT 7200

CGCATAGGTA CAACTAAACG TGTAGAAGTG AGTCAAGGAA ATGTACAAAC GAAAAAGGTA 7260

TTAGAAAGTA CAAACCTAAA TATAGATGAT TTTGTTGATG ATCCTTTAAG TTATGTTAAG 7320

ACGCCGAGTA ATAAAGTGTT AGGATTTTAT TCGAATAATG CAAATACTAA TGCTTTTAGA 7380

CCGGGTGGAG CCCAACAATT AAATGAATAT CAATTAAGTC AATTATTTAC TGATCAAAAA 7440

TTACAAGAAG CAGCAAGAAC TAGAAACCCA ATAAGATTAA TGATTGGTTT CGACTATCCT 7500

GATGCTTATG GTAATAGTGA AcTTTAGTTC CTGTTAACTT AACGGTATTA CCTGAAATCC 7560

AACATAATAt TaAATTCTTT AAAAATGACG ATACTCAAAA TATTGCTGAA AAACCATTTT 7620 

CAAAACAAGC TGGGCATCCA GTTTTCTATG TATATGCAGG TAACCAAGGG AATGCTTCCG 7680

TGAATTTAGG TGGTAGCGTA ACATCTATTC AACCATTACG TATTAATTTA ACAAGTAATG 7740

AGAATTTTAC AGATAAAGAT TGGCAAATTA CAGGTATTCC GCGTACATTA CACATTGAAA 7800

ACTCGACAAA TAGACCTAAT AATGCCAGAG AACGCAATAT TGAACTTGTT GGTAACTTAT 7860

TACCBGGGGA TTACTTTGGA ACGATACGTT TTGGACGTAA AGAACAATTA TTCGAAATTC 7920

GTGTTAAACC ACATACACCA ACAATTACAA CGACAGCTGA GCAATTAAGA GGTACAGCAT 7980

TACAAAAAGT GCCTGTTAAT ATTTCGGGAA TACCGTTGGA TCCATCGGCA TTGGTTTATT 8040

TAGTTGCACC AACAAATCAA ACTACGAATG GTGGTAGTGA GGCAGATCAA ATACCATCTG 8100 

GTTATACGAT ACTTGCGACT GGTACACCTG ATGGGGTGCA TAATACAATT ACTATACGAC 8160 

CGCAAGATTA TGTTGTATTC ATACCACCTG TAGGTAAACA AATTAGAGCA GTAGTTTATT 8220

ATAATAAAGT AGTTGCATCT AATATGAGTA ATGCTGTTAC TATTTTGCCA GATGACATTC 8280

CACCAACAAT CAATAATCCT GTTGGAATAA ATGCCAAATA CTATCGAGGC GACGAAkCAA 8340

CTTTACAATG GGTGTCTCTG ATAGACATTC TGGTATAAAA AATACAACTA TTACGACATT 8400




TACAGGTAGA GTGAGTATGA A7CAGGCAT7

GACAGaCAAT GTCAATAATA CGACAAATGA

AGGTAAAATT AGTGAAGAT3 C7CATCCGAT

AGTCAA7CCG ACTGCTGTAT CTAATGATGA

TAAAAACCAA AA7ATAAGAG GATATITAGC

TGGTAATGTC ACATTACAT7 ACCGTGATGG

GATGACATAC GAACCAGTTG TGAAACCTGA

AACGGTAACG ATTGCTAAAG GACAATCATT

TTTAAGTAAT GGACAACCTA TTCCAAGTGG

TATTCCAACT GCACAAGAAG TTAGTCAAAT

TGCTACAAAT GCGTATCATA AAGATAGTGA

TGTGAAACAA CCAGAAGGCG ATCAACGTGT

TGATGAAATC TCAAAAGTAA AACAAGCATT

TGCCGAAGGT GATATTTCAG TTACAAATAC

AGTAAATATT AATAAAGG7C GATTAACGAA

7TTCTTGCGT TGGGTTAATT TCCCACAAGA

TGCAAACAGA CCAACAGATG GTGGTTTATC

TCGTTATGAT GCTACATTAG GTACTCAAAT

AGCAACAACT ACAGTGCCTG GATTGCGAAA

AGAAGCTGGC GGAAGACCTA ACTTTAGAAC

TGATGGTCAA CGTCAATTTA CGTTGAATGG

CCCTTCAAAC GGTTATGGTG GGCAACCTGT

TAACTCAACT GTTGTTAACG TAAACGAACC

TGACCACGTT GTAAAAAGTA ATTCTACACA

GTTATACTTA ACGCCATATG G7CCAAAACA

AAATACTACT GACGCTATTA ACATTTATTT

TTCAGTAGGT AATTACACTA ATCATCAAGT

TACAGCGAAT GATAACTTTG GTGTGCAATC

AGGTACTGTT GATAATAACC ATCAACATGT
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TAACAGTGAT ATTACATTTA AAGTGTCAGC 8 52 0 

TAGTCAATCT AAACATGTTT CAATTCATGT 8580

TGTATTAGGA AATACTGAGA AAGTTGTAGT 8640

AAAGCAAAGC ATAATTACTG CCTTTATGAA 8700

ATCAACTGAT CCAGTAACTG TCGATAATAA 8760

CTCATCGACA ACGCTTGATG CTACAAATGT 8820

ATACCAAACT GTCAATGCTG CTAAAACAGC 8880

TAGTATTGGT GATATTAAAC AATATTTTAC 8940

CACATTTACA AATATTACAT CTGATAGAAC 9000

GAACGCAGGC ACGCAGTTAT ACCATATAAC 9060

AGACTTCTAT ATTAGTTTGA AAATCATCGA 9120

ATATCGTACA TCAACATATG ATTTAACTAC 9180

TATTAATGCA AATAGAGATG TAATTACGCT 9240

ACCTAATGGT GCTAATGTAA GTACTATTAC 9300

ATCATTCGCG TCAAACCTAG CTAATATGAA 9360

TTATACAGTG ACATGGACGA ATGCAAAAAT 9420

ATGG7CTGAT GACCATAAAT CTTTAATTTA 9480

TACGACGAAT GATATTTTAA CAATGTTAAA 9540

TAACATTACT GGTAATGAAA AATCACAAGC 9600

GACTGGTTAT TCACAATCAA ATGCGACAAC 9660

TCAAGTGATT CAAGTGTTAG ACATCATCAA 9720

TACAAATTCA AATACTCGTG CAAACCATAG 9780

GGCAGCTAAT GGTGcTGGCG CATTTACAAT 9840

TAATGCAAGT GATGCAGTTT ATAAAGCACA 9900

ATATGTTGAA CATTTAAATC AAAATACAGG 9960

TGTACCAAGT GACTTAGTGA ATCCAACAAT 10020

G7TCTCAGGT GAAACATTTA CAAATACTAT 10080

TGTAACTGTA CCAAATACAT CACAAAT7AC 10140

T7CTGCAACG GCACCAAATG TGACATCAGC 10200
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is 






GTTCAATGTA ACAGTGAAAC CTTTGCGTGA TAAATATCGA GTTGGTACTT CATCAACGGC 10320

TGCTAATCCT GTGAGAATTG CCAATATTTC GAATAATGCG ACAGTATCAC AAGCTGATCA 10380

AACGACAATT ATTAATTCGT TAACGTTTAC TGAAACAGTA CCAAATAGAA GTTATGCAAG 10440

AGCAAGTGCG AATGAAATCA CTAGTAAAAC AGTTAGTAAT GTCAGTCGTA CTGGAAATAA 10500

TGCCAATGTg CACAGTAACT GTTACTTATC AAGATGGAAC AACATCAACA GTGACTGTAC 10560

CTGTAAAGCA TGTCATTCCA GAAATCGTTG CACATTCGCA TTACACTGTA CAAGGCCAAG 10620

ACTTCCCAGC AGGTAATGGT TCTAGTGCAT CAGATTACTT TAAGTTATCT AATGGTAGTG 10680

ACATTGCAGA TGCAACTATT ACATGGGTAA GTGGACAAGC GCCAAATAAA GATAATACAC 10740

GTATTGGTGA AGATATAACT GTAACTGCAC ATATCTTAAT TGATGGCGAA ACAACGCCGA 10800

TTACGAAAAC AGCAACATAT AAAGTAGTAA GAACTGTACC GAAACATGTC TTTGAAACAG 10860

CCAGAGGTGT TTTATACCCA GGTGTTTCAG ATATGTATGA TGCGAAACAA TATGTTAAGC 10920

CAGTAAATAA TTCTTGGTCG ACAAATGCGC AACATATGAA TTTCCAATTT GTTGGAACAT 10980

ATGGTCCTAA CAAAGATGTT GTAGGCATAT CTACTCGTCT TATTAGAGTG ACATATGATA 11040

ATAGACAAAC AGAAGATTTA ACTATTTTAT CTAAAGTTAA ACCTGACCCA CCTAGAATTG 11100

ACGCAAACTC TGTGACATAT AAAGCAGGTC TTACAAACCA AGAAATTAAA GTTAATAACG 11160

TATTAAATAA CTCGTCAGTA AAATTATTTA AAGCAGATAA TACACCATTA AATGTCACAA 11220

ATATTACTCA TGGTAGCGGT TTTAGTTCGG TTGTGACAGT AAGTGACGCG TTACCAAATG 11280

GCGGAATTAA AGCAAAATCT TCAATTTCAA TGAACAATGT GACGTATACG ACGCAAGACG 11340

AACATGGTCA AGTTGTTACA GTAACAAGAA ATGAATCTGT TGATTCAAAT GACAGTGCAa 11400

CAGTAACAGT GACACCACAA TTACAAGCAA CTACTGAAGG CGCTGTATTT ATTAAAGGTG 11460

GCGA&GTTT TGATTTCGGA CACGTAGAAA GATTTATTCA AAACCCGCCA CATGGGGCAA 11520

CGGTTGCATG GCATGATAGT CCAGATACAT GGAAGAATAC AGTCGGTAAC ACTCATAAAA 11580

CTGCGGTTGT AACATTACCT AATGGTCAAG GTACGCGTAA TGTTGAAGTT CCAGTCAAAG 11640

TTTATCCAGT TGCTAATGCA AAGGCGCCAT CACGTGATGT GAAAGGTCAA AATTTGACTA 11700

ATGGAACGGA TGCGATGAAC TACATTACAT TTGATCCAAA TACAAACACA AATGGTATCA 11760

CTGCAGCATG GGCAAATAGA CAACAACCAA ATAACCAACA AGCAGGCGTG CAACATTTAA 11820

ATGTCGATGT CACATATCCA GGTATTTCAG CTGCTAAACG AGTTCCTGTT ACTGTTAATG 11880

TATATCAATT TGAATTCCCT CAAACTACTT ATACGACAAC GGTTGGAGGC ACTTTAGCAA 11940

GTGGTACGCA AGCATCAGGA TATGCACATA TGCAAAATGC TACTGGTTTA CCAACAGATG 12000
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TGAATAAACC GAATGTGGCT AAAGTCGTTA ACGCAAAATA TGACGTCATC TATAACGGAC 12120 

ATACTTTTGC AACATCTTTA CCAGCGAAAT TTGTAGTAAA AGATGTGCAA CCAGCGAAAC 12180

CAACTGTGAC TGAAACAGCG GCAGGAGCGA TTACAATTGC ACCTGGAGCA AACCAAACAG 12240

TGAATACACA TGCCGGTAAC GTAACGACAT ACGCTGATAA ATTAGTTATT AAACGTAATG 12300

GTAACGTTGT GACGACATTT ACACGTCGCA ATAATACGAG TCCATGGGTG AAAGAAGCAT 12360

CTGCAGCAAC TGTAGCAGGT ATTGCTGGAA CTAATAATGG TATTACTGTT GCAGCAGGTA 12420 

CTTTCAACCC TGCTGATACA ATTCAAGTTG TTGCAACGCA AGGAAGCGGA GAGACAGTGA 12480 

GTGATGAGCA ACGTAGTGAT GATTTCACAG TTGTCGCACC ACAACCGAAC CAAGCGACTA 12540 

CTAAGATTTG GCAAAATGGT CATATTGATA TCACGCCTAA TAATCCATCA GGACATTTAA 12600 

TTAATCCAAC TCAAGCAATG GATATTGCTT ACACTGAAAA AGTGGGTAAT GGTGCAGAAC 12660 

ATAGTAAGAC AATTAATGTT GTTCGTGGTC AAAATAATCA ATGGACAATT GCGAATAAGC 12720 

CTGACTATGT AACGTTAGAT GCACAAACTG GTAAAGTGAC GTTCAATGCC AATACTATAA 12780 

AACCAAATTC ATCAATCACA ATTACTCCGA AAGCAGGTAC AGGTCACTCA GTAAGTAGTA 12840

ATCCAAGTAC ATTAACTGCA CCGGCAGCTC ATACTGTCAA CACAACTGAA ATTGTGAAAG 12900 

ATTATGGTTC AAATGTAACA GCAGCTGAAA TTAACAATGC AGTTCaAGTT GCTAATAAAC 12960

GTACTGCAAC GATTAAAAAT GGCACAGCAA TGCCTACTAA TTTAGCTGGT GGTAGCACAA 13020

CGACGATTCC TGTGACAGTA ACTTACAATG ATGGTAGTAC TGAAGAAGTA CAAGAGTCCA 13080

TTTTCACAAA AGCGGATAAA CGTGAGTTAA TCACAGCTAA AAATCATTTA GATGATCCAG 13140

TAAGCACTGA AGGTAAAAAG CCAGGTACAA TTACGCAGTA CAATAATGCA ATGCATAATG 13200 

CGCAACAACA AATCAATACT GCGAAAACAG AAGCACAACA AGTGATTAAT AATGAGCGTG 13260 

CAACACCACA ACAAGTTTCT GACGCACTAA CTAAAGTTCG TGCAGCACAA ACTAAGATTG 13320 

ATCAAGCTAA AGCATTACTT CAAAATAAAG AAGATAATAG CCAATTAGTA ACGTCTAAAA 13380

ATAACTTACA AAGTTCTGTG AACCAAGTAC CATCAACTGC TGGTATGACG CAACAAAGTA 13440 

TTGATAACTA TAATGCGAAG AAGCGTGAAG CAGAAACTGA AATAACTGCA GCTCAACGTG 13500 

TTATTGACAA TGGCGATGCA ACTGCACAAC AAATTTCAGA TGAAAAACAT CGTGTCGATA 13560 

ACGCATTAAC AGCATTAAAC CAAGCGAAAC ATGATTTAAC TGCAGATACA CATGCCTTAG 13620

AGCAAGCAGT GCAACAATTG AATCGCACAG GTACAACGAC TGGTAAGAAG CCGGCAAGTA 13680

TTACTGCTTA CAATAATTCG ATTCGTGCAC TTCAAAGTGA CTTAACAAGT GCTAAAAATA 13740

GCGCTAATGC TATTATTCAA AAGCCAATAA GAACAGTACA AGAAGTGCAA TCTGCGTTAA 13800 
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CTGATAATAG TGCTTTAAAA ACTGCTAAGA CGAAACTTGA TGAAGAAATC AATAAATCAG 13 920 

TAACTACTGA TGGTATGACA CAATCATCAA TCCAAGCATA TGAAAATGCT AAACGTGCGG 13980

GTCAAACAGA ATCAACAAAT GCACAAAATG TTATTAACAA TGGTGATGCG ACTGACCAAC 14040

AAATTGCCGC AGAAAAAACA AAAGTAGAAG AAAAATATAA TAGCTTAAAA CAAGCAATTG 14100 

CTGGATTAAC TCCAGACTTG GCACCATTAC AAACTGCAAA AACTCAGTTG CAAAATGATA 14160 

TTGATCAGCC AACGAGTACG ACTGGTATGA CAAGCGCATC TATTCCAGCA TTTAATGAAA 14220

AACTTTCAGC AGCTAGAACT AAAATTCAAG AAATTGATCG TGTATTAGCC TCACATCCAG 14280

ATGTTGCGAC AATACGTCAA AACGTGACAG CAGCGAATGC CGCTAAATCA GCACTTGATC 14340

AAGCACGTAA TGGCTTAACA GTCGATAAAG CGCCTTTAGA AAATGCGAAA AATCAACTAC 14400 

AACATAGTAT TGACACGCAA ACAAGTACAA CTGGTATGAC ACAAGACTCT ATAAATGCAT 14460

ACAATGCGAA GTTAACAGCT GCACGTAATA AGATTCAACA AATCAATCAA GTATTAGCAG 14520

GTTCACCGAC TGTAGAACAA ATTAATACAA ATACGTCTAC AGCAAATCAA GCTAAATCTG 14580

ATTTAGATCA TGCACGTCAA GCTTTAACAC CAGATAAAGC GCCGCTTCAA ACTGCGAAAA 14640

CGCAATTAGA ACAAAGCATT AATCAACCAA CGGATACAAC AGGTATGACG ACCGCTTCGT 14700

TAAATGCGTA CAACCAAAAA TTACAAGCA3 CGCGTCAAAA GTTAACTGAA ATTAATCAAG 14760

TGTTGAATGG CAACCCAACT GTCCAAAATA TCAATGATAA AGTGACAGAG GCAAACCAAG 14820

CTAAGGATCA ATTAAATACA GCACGTCAAG GTTTAACATT AGATAGACAG CCAGCGTTAA 14880

CAACATTACA TGGTGCATCT AACTTAAACC AAGCACAACA AAATAATTTC ACGCAACAAA 14940

TTAATGCTGC TCAAAATcAT GctGCGCTTG AAACAATTAA GTCTAACATT ACGGCTTTAA 15000

ATACTGCGAT GACGAAATTA AAAGACAGTG TTGCGGATAA TAATACAATT AAATCAGATC 15060

AAAATTACAC TGACGCAACA CCAGCTAATA AACAAGCGTA TGATAATGCA GTTAATGCGG 15120

CTAAAGGTGT CATTGGAGAA ACGACTAATC CAACGATGGA TGTTAACACA GTGAACCAAA 15180 

AAGCAGCATC TGTTAAATCG ACGAAAGATG CTTTAGATGG TCAACAAAAC TTACAACGTG 15240 

CGAAAACAGA AGCAACAAAT GCGATTACGC ATGCAAGTGA TTTAAACCAA GCACAAAAGA 15300 

ATGCATTAAC ACAACAAGTG AATAGTGcAC AAAACGTGCA AGCAGTAAAT GATATTAAAC 15360 

AAACGACTCA AAGCTTAAAT ACTGCTATGA CAGGTTTAAA ACGTGGCGTT GCTAATCATA 15420

ACCAAGTCGT ACAAAGTGAT AATTATGTCA ACGCAGATAC TAATAAGAAA AATGATTACA 15480 

ACAATGCATA CAACCATGCG AATGACATTA TTAATGGTAA TGCACAACAT CCAGTTATAA 15540 

CACCAAGTGA TGTTAACAAT GCTTTATCAA ATGTCACAAG TAAAGAACAT GCATTGAATG 15600 






EP0 786 519 A2 

ATTTAAATAA TGCACAACGT CAAAACTTAC AATCGCAAAT TAATGGTGCG CATCAAATTG 15720

ATGCAGTTAA TACAATTAAG CAAAATGCAA CAAACTTGAA TAGTGCAATG GGTAACTTAA 15780

GACAAGCTGT TGCAGATAAA GATCAAGTGA AACGTACAGA AGATTATGCG GATGCAGATA 15840

CAGCTAAACA AAATGCATAT AACAGTGCAG TTTCAAGTGC CGAAACAATC ATTAATCAAA 15900

CAACAAATCC AACGATGTCT GTTGATGATG TTAATCGTGC AACTTCAGCT GTTACTTCTA 15960

ATAAAAATGC ATTAAATGGT TATGAAAAAT TAGCACAATC TAAAACAGAT GCTGCAAGAG 16020

CAATTGATGC ATTACCACAT TTAAATAATG CACAAAAAGC AGATGTTAAA TCTAAAATTA 16080

ATGCTGCATC AAATATTGCT GGCGTAAATA CTGTTAAACA ACAAGGTACA GATTTAAATA 16140

CAJtCGATGGg TAACTTGCAA GGTGCAATCA ATGATGAACA AACGACGCTT AATAGTCAAA 16200

ACTATCAAGA TGCGACACCT AGTAAGAAAA CAGCATACAC AAATGCGGTA CAAGCTGCGA 16260 

AAGATATTTT AAATAAATCA AATGGTCAAA ATAAAACGAA AGATCAAGTT ACTGAAGCGA 16320

TGAATCAA3T GAATTCTGCT AAAAATAACT TAGATGGTAC GCGTTTATTA GATCAAGCGA 16380 

nCAAaCAGCA AAACAGCAGT TAAATAATAT GACGCATTTA ACAACTGCAC AAAAAACGAA 16440

TTTAACAAAC CAAATTAATA GTGGTACTAC TGTCGCTGGT GTTCAAACGG TTCAATCAAA 16500 

TGCCAATACA TTAGATCAAG CCATGAATAC GTTAAGACAA AGTATTGCCA ACAAAGATGC 16560

GACTAAAGCA AGTGAAGATT ACGTAGATGC TAATAATGAT AAGCAAACAG CATATAACAA 16620

CGCAGTAGCT GCTGCTGAAA CGATTATTAA TGCTAATAGT AATCCAGAAA TGAATCCAAG 16680 

TACGATTACA CAAAAAGCAG AGCAAGTGAA TAGTTCTAAA ACGGCACTTA ACGGTGATGA 16740

AAACTTAGCT GCTGCAAAAC AAAATGCGAA AACGTACTTA AACACATTGA CAAGTATTAC 16800

AGATGCTCAA AAGAACAATT TGATTAGTCA AATTACTAGT GCGACAAGAG TGAGTGGTGT 16860

TGAXACTGTA AAACAAAATG CGCAACATCT AGACCAAGCT ATGGCTAGCT TACAGAATGG 16920

TATTAACAAC GAATCTCAAG TGAAATCATC TGAGAAATAT CGTGATGCTG ATACAAATAA 16980

ACAACAAGAG TATGATAATG CTATTACTGC AGCGAAAGCG ATTTTAAATA AATCGACAGG 17040

TCCAAACACT GCGCAAAATG CAGTTGAAGC AGCATTACAA CGTGTTAATA ATGCGAAAGA 17100 

TGCATTGAAT GGTGATGCAA AATTAATTGC AGCTCAAAAC GCAGCGAAAC AACATTTAGG 17160 

TACTTTAACG CATATCACTA CAGCTCAACG TAATGATTTA ACAAATCAAA TTTCACAAGC 17220 

TACAAACTTA GCTGGTGTTG AATCTGTTAA ACAAAATGCG AATAGTTTAG ATGGTGCTAT 17280

GGGTAACTTA CAAACGGCTA TCAACGATAA GTCAGGAACA TTAGCGAGCC AAAACTTCTT 17340

GGATGCTGAT GAGCAAAAAC GTAATGCATA CAATCAAGCT GTATCAGCAG CCGAAACCAT 17400




TGTTAATAAT GCGAAACATG CATTAAATGG 7ACGCAAAAC TTAAACAATG CGAAACAAGC 17520

AGCGATTACA GCAATCAATG GCGCATCTGA TTTAAATCAA AAACAAAAAG ATGCATTAAA 17580

AGCACAAGCT AATGGTGCTC AACGCGTATC TAATGCACAA GATGTACAGC ACAATGCGAC 17640 

TGAACTGAAC ACGGCAATGG GCACATTAAA ACATGCCATC GCAGATAAGA CGAATACGTT 17700 

AGCAAGCAGT AAATATGTTA ATCCCGATAG CACTAAACAA AATGCTTACA CAACTAAAGT 17760 

TACCAATGCT GAACATATTA TTAGCGGTAC GCCAACGGTT GTTACGACAC CTTCAGAAGT 17820 

AACAGCTGCA GCTAATCAA3 TAAACAGCGC GAAACAAGAA TTAAATGGTG ACGAAAGATT 17880

ACGTGAAGCA AAACAAAACG CCAATACTGC TATTGATGCA TTAACACAAT TAAATACACC 17940 

TCAAAAAGCT AAATTAAAAG AACAAGTGGG ACAAGCCAAT AGATTAGAAG ACGTACAAAC 18000 

TGTTCAAACA AATGGACAAG CATTGAACAA TGCAATGAAA GGCTTAAGAG ATAGTATTGC 18060

TAACGAAACA ACAGTCAAAA CAAGTCAAAA CTATACAGAC GCAAGTCCGA ATAACCAATC 18120 

AACATATAAT AGCGCTGTGT CAAATGCGAA AGGTATCATT AATCAAACTA ACAATCCGAC 18180 

TATGGATACT AGTGCGATTA CCCAAGCTAC AACACAAGTG AATAATGCTA AAAATGGTTT 18240

AAACGGTGCT GAAAACTTAA GAAATGCACA AAACACTGCT AAGCAAAACT TAAATACATT 18300 

ATCACACTTA ACAAATAACC AAAAATCTGC CATCTCATCA CAAATTGATC GTGCAGGTCA 18360

TGTGAGTGAG GTAACTGCTA CTAAAAATGC AGCAACTGAG TTGAATACGC AAATGGGTAA 18420 

CTTGGAACAA GCTATCCATG ATCAAAACAC AGTTAAACAA AGTGTTAAAT TTACTGATGC 18480 

AGATAAAGCT AAACGTGATG CGTATACAAA TGCGGTAAGC AGAGCTGAAG CAATTCTGAA 18540

TAAAACGCAA GGTGCAAATA CGTCTAAACA AGATGTTGAA GCGGCTATTC AAAATGTTTC 18600 

AAGTGCTAAA AATGCATTGA ATGGTGATCA AAACGTTACA AATGCGAAGA ATGCAGCTAA 18660

AAATGCATTA AATAACTTAA CGTCAATTAA TAATGCACAA AAACGTGACT TAACAACTAA 18720

AATTGATCAA GCAACAACTG TAGCTGGTGT TGAAGCTGTA TCTAATACGA GTACACAATT 18780

GAAtACAGCG ATGGCTAACT TGCAAAATGG TATTAATGAT AAAACAAATA CACTAGCAAG 18840

TGAAAACTAT CATGATGCTG ATTCAGATAA GAAAACTGCT TATACTCAAG CCGTTACGAA 18900

CGCAGAAAAT ATTTTAAATA AAAATAGTGG ATCAAATTTA GACAAAACTG CCGTTGAAAA 18960

CGCGTTGTCA CAAGTTGCTA ATGCGAAAGG TGCCCTAAAT GGTAACCATA ATTTAGAGCA 19020 

AGCTAAATCA AATGCAAACA CTACTATAAA CGGACTTCAA CATTTAACAA CTGCTCAAAA 19080

AGATAAATTG AAACAACAAG TGCAACAAGC ACAAAATGTT GCAGGTGTAG ATACTGTTAA 19140

ATCAAGTGCC AACACATTAA ATGGTGCTAT GGGTACGTTA AGAAATAGCA TACAAGATAA 19200 
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TAACAATGCT 


GTTGATAGTG 


CTAATGGTGT 


CATTAATGCA 


ACAAGCAATC 


CAAATATGGA 


19320 




TGCTAATGCA ATTAACCAAA 7CGCTACACA AGTGACATCA ACGAAAAATG CATTAGATGG 19380

TACACATAAT TTAACGCAAG CGAAACAAAC AGCAACAAAT GCCATCGATG GTGCTACTAA 19440

CTTAAATAAA GCGCAAAAAG ATGCGTTAAA AGCACAAGTT ACAAGTGCGC AACGTGTTGC 19500

AAATGTAACA AGTATCCAAC AAACTGCAAA TGAACTTAAT ACAGCTATGG GTCAATTACA 19560

ACATGGTATT GATGATGAAA ATGCAACAAA ACAAACTCAA AAATATCGTG ACGcTGAACA 19620

AAGTAAGAAA ACTGCTTATG ATCAAGCTGT AGCTGCTGCG AAAGCAATTT TAAATAAACA 19680

AACAGGTTCA AATTCAGATA AAGCAGCAGT TGACCGTGCA TTACAACAAG TAACAAGTAC 19740

GAAAGATGCA TTGAATGGTG ATGCAAAACT GGCAGAAGCG AAAGCGGCAG CTAAACAAAA 19800

CTTAGGCACT TTAAACCATA TTACGAATGC ACAACGTACT GACTTAGAAG GCCAAATCAA 19860

TCAAGCGACG ACTGTTGATG GCGTTAATAC TGTAAAAACA AATGCCAATA CATTAGACGG 19920

CGCAATGAAT AGCTTACAAG GTTCAATCAA TGATAAAGAT GCGACATTAA GAAATCAAAA 19980

TTATCTTGAT GCGGATGAAT CAAAACGAAA TGCATATACG CAAGCTGTCA CAGCGGCTGA 20040

AGGCATTTTA AATAAACAAA CTGGTGGTAA CACATCTAAA GCAGACGTTG ATAATGCATT 20100

AAATGCAGTT ACAAGAGCGA AAGcGgCTTT AAATGGTGCT GACAACTTAA GAAATGCGAA 20160

AACTTCAGCA ACAAATACGA TTGATGGTTT ACCTAACTTA ACACAATTAC AAAAAGACAA 20220

CTTGAAGCAT CAAGTTGAaC AAGCGCAAAA TGTAGCAGGT GTAAATGGTG TTAAAGATAA 20280

AGGTAATACG TTAAATACTG CCATGGGTGC ATTACGTACA AGTATCCAAA ATGATAATAC 20340

GACGAAAACA AGTCAAAATT ATCTTGATGC ATCTGACAGC AACAAAAATA ATTACAATAC 20400

TGCTGTAAAT AATGCAAATG GTGTTATTAA TGCAACGAAC AATCCAAATA TGGATGCTAA 20460

TGCGATTAAT GGCATGGCAA ATCAAGTCAA TACAACAAAA GCAGCGTTAA ATGGTGCACA 20520

AAACTTAGCT CAAGCTAAAA CAAATGCGAC GAACACAATT AACAACGCAC ATGACTTAAA 20580

CCAAAAACAA AAAGATGCAT TAAAAACACA AGTTAACAAT GCACAACGTG TATcTGATGC 20640

AAATAACGTT CAACACACTG CAACTGAATT GAACAGTGCG ATGACAGCAC TTAAAGCAGC 20700

TATTGCTGAT AAAGAAAGAA CAAAAGCAAG CGGTAATTAT GTCAATGCTG ATCAAGAAAA 20760

ACGTCAAGCG TATGATTCAA AAGTGACTAA CGCTGAAAAT ATCATTAGTG GTACACCGAA 20820

TGCGACATTA ACAGTCAATG ACGTAAATAG TGCGGCATCA CAAGTCAATG CGGCTAAAAC 20880

AGCATTAAAT GGTGATAACA ACTTACGTGT AGCGAAAGAG CATGCCAACA ATACAATTGA 20940

CGGCTTAGCA CAATTGAATA ATGCACAAAA AGCAAAATTA AAAGAACAAG TTCAAAGTGC 21000
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GAAAGGCTTA 


AGAGATAGTA 


T^GCGAATGA 


AGCAACAATT 


A aa n r a n n t r 




21120 


TGACGCAAGT 


CCAAATAATC 


GTAACGAGTA 


CGACAGTGCA 


GTTACTGCAG 


L,AAAAO\_A-H 1 


21180


CATTAATCAA 


ACATCGAACC 


CAACGATGGA 


ACCAAATACT 


ATTACGCAAG 




T 1 ~> A *"l 


AGTGACAACT 


AAAGAACAGG 


CATTAAATGG 


TGCGCGAAAC 


TTAGCTCAAG 


pta A.p»&p2ia.p 


21300


TGCGAAAAAC 


AACTTGAATA 


ACTTAACATC 


AATTAACAAT 


GCACAAAAAG 


A -"tyipp-Tt aar 


21360


GCGTAgcATT 


GATGGTGCAA 


CAACAGTAGC 


TGGTGTAAAT 


CAAGAAACTG 


pzv &. hippa 


21420


AGAATTAAAT 


AACGCAATGC 


ATAGTTTACA 


AAATGGTATC 


AATGATGAGA 


pnpA.Aa.pfl.2m 


21480


ACAAACTCAG 


AAATACCTAG 


ATGCAGAGCC 


AAGTAAGAAA 


TCAGCTTATG 






AAATGCAGCG 


AAAGCAATTT 


TAACAAAAGC 


TAGTGGTCAA 


AATGTAGACA 


A & p. p & p. p a itt 

AAO UALa 1 




TGAACAAGCA 


TTGCAAAATG 


TGAACAGTAC 


GAAGACGGCG 


TTGAACGGTG 


iTrrra a &tt 

M 1 V^V-ljAAA i 1 




AAATGAAGCT 


AAAGCAGCTG 


CGAAACAAAC 


GTTAGGTACA 


TTAACACACA 




21720 


ACAACGTACA 


GCGTTAGACA 


ATGAAATTAC 


ACAAGCAACA 


AATGTTGAAG 


ulbl I AA 1 AL 


21780


AGTTAAAGCC 


AAAGCGCAAC 


AATTAGATGG 


TGCTATGGGT 


CAATTAGAAA 


P A TP A A TT 


21840


TGATAAAGAC 


ACGACGTTAC 


AAAGTCAAAA 


TTATCAAGAT 


GCTGATGATG 


r*T haa pp rap 


21900 


TGCTTATTCT 


CAAGCAGTAA 


ATGCAGCAGC 


AACTATTTTA 


AATAAAACAg 


U 1 v_j(j U^jtj I AA 


21960 


TACACCTAAA 


GCAGATGTTG 


AAAGAGCAAT


GCAAGCTGTT 


ACACAAGCAA 


ATACTGcATT 


22020


AAACGGTATT 


CAmAACTTAG 


ATCGTGCGAA 


ACArGCTGCT 


AACACAGCGA 


TTACAAATGC 


22080


TTCGGACTTA 


AATACAAAAC 


mAAAAGAAGC 


ATTAAAAgCA CAAGTAACAA GTGCAGGACG 


22140


TGTATCTGCA 


GCAAATGGTG 


ttt; aacat a c 


TGCGACTGAA 


TTAAATACTG 


CGATGACAGC 


22200


TTTAAAGCGT 


GCCATTGCTG 


ATAAAGCTGA 


GACAAAAGCT 


AGTGGTAACT 


ATGTCAATGC


22260


TGATGCGAAT 


AAACGTCAAG


CATATGATGA 


AAAAGTTACA 


GCTGCCGAAA


ATATCGTTAG 


22320


TGGTACACCA 


ACACCAACGT 


TAACACCAGC 


AGATGTTACA 


AATGCAGCAA 


CGCAAGTAAC 


22380


GAATGCTAAG 


ACGCAGTTAA 


ACGGTAATCA 


TAATTTAGAA 


GTAGCGAAAC 


AAAATGCTAA 


22440


CACTGCAATT 


GATGGTTTAA 


CTTCTTTAAA 


TGGTCCGCAA 


AAAGCAAAAC 


TTAAAGAACA 


22500


AGTGGGTCAA 


GCGACGACGT 


TGCCAAATGT 


TCAAACTGTT 


CGTGATAATG 


CACAAACATT 




AAACACTGCA 


ATGAAAGGTC 


TACGAGATAG 


CATTGCGAAT 


GAAGCAACGA 


TTAAAGCAGG 




TCAAAACTAC 


ACAGATGCAA 


GTCAAAACAA 


ACAAACTGAC 


TACAACAGTG 


CAGTCACTGC 


22680 


AGCAAAAGCA 


ATCATTGGTC 


AAACAACTAG 


TCCATCAATG 


AATGCGCAAG 


AAATTAATCA 


22740 


AGCGAAAGAC 


CAAGTGACAG 


CTAAACAACA 


AGCGTTAAAC 


GGTCAAGAAA 


ACTTAAGAAC 


22800 
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AGATGCAGTG 


AAACGTCAAA 


TCGAAGGTGC 


AACGCATGTT 


AATGAAGTAA 


CACAAGCACA 


22920 




AAATAATGCG 


GATGCaTTAA 


ATACAGCTAT 


GACGAACTTG 


AAAAATGGTA 


TTCAAGATCA 


22980 




GAATACGATT 


AAGCAAGGTG 


TTAACTTCAC 


TGATGCCGAC 


GAAGCGAAAC 


GTAATGCATA 


23040 




TACAAATGCA 


GTGACGCAAG 


CTGAACAAAT 


TTTAAATAAA 


GCACAAGGTC 


CAAATACTTC 


23100 




AAAAGACGGT 


GTCGAAACTG 


CGTTAGAaAA 


TGTACAACGT 


GCTAAAAACG 


AATTGAACGG 


23160 




TAATCAAAAT 


GTTGCGAACG 


CTAAGACAAC 


TGCGAAAAAT 


GCATTGAATA 


ACCTAACATC 


23220 




AATTAATAAT 


GCACAAAAAG 


AAGCATTGAA 


ATCACAAATT 


GAAGGTGCGA 


CAACAGTTGC 


23280




AGGTGTAAAT 


CAAGTGTCTA 


CAACGGCATC 


TGAATTAAAT 


ACAGCAATGA 


GCAACTTACA 


23340 


AAATGGTATT 


AATGATGAAG 


CAGCTACAAA 


AGCAGCGCTT 


AATGGTACTC 


AAAACCTTGA 


23400 




AAAAGCTAAA 


CAACACGCAA 


ATACAGCAAT 


TGACGGTTTA 


AGCCATTTAA


CAAATGCACA 


23460 




AAAAGAGGCA 


TTAAAACAAT 


TGGTACAACA 


ATCGACTACT 


GTTGCAGAAG 


CACAAGGTAA 


23520 




TGAGCAAAAA 


GCAAACAATG 


TTGATGCAGC 


AATGGACAAA 


TTACGTCAAA 


GTATTGCAGA 


23580 




TAATGCGACA 


ACAAAACAAA 


ACCAAAATTA


TACTGATGCA 


AGTCAGAATA 


AAAAGGATGC 


23640 




GTACAATAAT 


GCTGTCACAA 


CTGCACAAGG 


TATTATTGAT 


CAAACTACAA 


GTCCAACTTT 


23700 




AGATCCGACT 


GTTATCAATC 


AAGCTGCTGG 


ACAAGTAAGC 


ACAACTAAAA 


ATGCATTAAA 


23760 




TGGTAATGAA 


AACCTAGAGG 


CAGCGAAACA


ACAAGCGTCA 


CAATCATTAG 


GTTCATTAGA 


23820 




TAACTTAAAT 


AATGCGCAAA 


AACAAACAGT 


TACTGATCAA 


ATTAATGGCG 


CGCATACTGT 


23880 




TGATGAAGCA 


AATCAAATTA 


AGCAAAATGC 


GCAAAACTTA 


AATACAGCGA 


TGGGTAACTT 


23940 




GAAACAAGCG 


ATAGcTGACA 


AAGATGCTAC 


GAAAGCGACA 


GTTAACTTCA 


CTGATGCAGA 


24000 




TCAAGCAAAA 


CAACAAGCAT


ATAACaCTGC 


TGTTACAAAT 


GCTGAAAATA 


TCATTTCAAA 


24060 




AGCTAATGGC 


GGCAATGCAA 


CACAAGCTGA 


AGTTGAACAA 


GCAATCAAAC 


AAGTTAATGC 


24120 




TGCAAAACAA 


GCATTAAATG 


GTAATGCCAA 


CGTTCAACAT 


GCAAAAGACG 


AAGCAACAGC 


24180 


ATTAATTAAT 


AGCTCTAATG 


ACCTTAACCA 


AGCACAAAAA 


GACGCATTAA 


AACAACAAGT


24240 




TCAAAATGCA 


ACTACTGTAG 


CTGGTGTAAA 


CAATGTTAAA 


CAAACAGCAC 


AAGAGTTAAA 


24300 




CAATGCTATG 


ACACAATTAA 


AACAAGGCAT 


TGCAGATAAA 


GAACAAACAA 


AAGCTGATGG 


24360 




TAACTTTGTC 


AATGCAGATC 


CTGATAAGCA 


AAATGCATAT 


AATCAAGCAG 


TAGCGAAAGC 


24420 




TGAAGCATTA 


ATTAGTGctA 


CGCCTGATGT 


TGTCGTTACA 


CCTAGCGAAA 


TTACTGCAGC 


24480 




GTTAAATAAA 


GTTACGCAAG 


CTAAAAATGA 


TTTAAATGGT 


AATACAAACT 


TAGCAACGGC 


24540 




GAAACAAAAT 


GTTCAACATG 


CTATTGATCA 


ATTGCCAAAC 


TTAAACCAAG 


CGCAACGTGA 


24600 
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AGCGGCGACA 


ACGCTTAATG 


ACGCGATGAC 


ACAATTGAAA 


CAAGGTATTG 


CGAATAAAGC 


24720 




ACAAATTAAA 


GGTAGCGAGA 


ACTATCACGA 


TGCTGATACT 


GACAAGCAAA 


CAGCATATGA


24780 




TAATGCAGTA 


ACAAAAGCAG 


AAGAATTGTT 


AAAACAAACA 


ACAAATCCAA 


CAATGGATCC 


24840 




AAATACAATT 


CAACAAGCAT 


TAACTAAAGT 


GAATGACACA 


AATCAAGCAC 


TTAACGGTAA 


24900 




TCAAAAATTA 


GCTGATGCCA 


AACAAGATGC 


TAAGACAACA 


CTTGGTACAC 


TAGATCATTT 


24960 




AAATGATGCT 


CAAAAACAAG 


CGCTAACAAC 


TCAAGTTGAA 


CAAGCACCAG 


ATATTGCAAC 


25020 




AGTTAATAAT 


GTTAAGCAAA 


ATGCTCAAAA 


TCTGAATAAT 


GCTATGACTA 


ACTTAAACAA 


25080 




TGCATTACAA 


GATAAAACTG 


AGACATTAAA 


TAGCATTAAC 


TTTACTGATG 


CAGATCAAGC 


25140 


TAAGAAAGAT 


GCTTATACTA 


ATGCGGTTTC 


ACATGCAGAA 


GGTATTTTAT 


CTAAAGCAAA 


25200 




TGGCAGCAAT 


GCAAGTCAAA 


CTGAAGTGGA 


ACAAGCGATG 


CAACGTGTGA 


ACGAAGCGAA 


25260 




ACAAGCATTG 


AATGGTAATG 


ACAATGTACA 


ACGTGCAAAA 


GATGCAGCGA 


AACAAGTGAT 


25320 




TACAAATGCA 


AATGATTTAA 


ATCAAGCAAT 


GACACAATTG 


AAACAAGGTA 


TTGCAGATAA 


25380 




AGACCAAACT 


AAAGCAAATG 


GTAACTTTGT 


CAATGCTGAT 


ACTGATAAGC 


AAAATGCTTA 


25440 




CAACAATGCG 


GTAGCACATG 


CTGAACAAAT 


AATTAGTGGT 


ACACCAAATG 


CAAACGTGGA 


25500 




TCCACAACAA 


GTGGCTCAAG 


CGTTACAACA 


AGTGAATCaA 


GCTAAGGGTG 


ATTTAAACGG


25560 




TAACCATAAC 


TTACAAGTTG 


CTAAAGACAA 


TGCAAATACA 


GCCATTGATC 


AGTTACCAAA 


25620 




CTTAAATCAA 


CCACAAAAAA 


CAGCATTAAA 


AGACCAAGTG 


TCGCATGCAG 


AACTTGTTAC 


25680 




AGGTGTTAAT 


GCTATTAAGC 


AAAATGCTGA 


TGCGTTAAAT 


AATGcAATGG 


GTACATTGAA 


25740 




ACAACAAATT 


CAAGCGAACA 


GTCAAGTACC 


ACAGTCAGTT 


GACTTTACAC 


AAGCGGATCA 


25800 




AGACAAACAA 


CAAGCATATA 


ACAATGCGGC 


TAACCAAGCG 


CAACAAATCG 


CAAATGGCAT 


25860 




ACCAACACCT 


GTATTGACGC 


CTGATACAGT 


AACACAAGCA 


GTGACAACTA 


TGAATCAAGC


25920 




GAAAGATGCA 


TTAAACGGTG 


ATGAAAAATT 


AGCACAAGCG 


AAACAAGAAG 


CTTTAGCAAA 


25980 


TCTTGATACG 


TTACGCGATT 


TAAATCAACC 


ACAACGTGAT 


GCATTACGTA 


ACCAAATCAA 


26040 




TCAAGCACAA 


GCGTTAGCTA 


CAGTTGAACA 


AACTAAACAA 


AATGCACAAA 


ATGTGAATAC 


26100 




aGCaATGAGT 


AACTTGAAAC 


aAGGTATTGC 


aAACAAAGAT 


ACTGTCAAAG 


CAAGTGAGAA 


26160 




CTATCATGAT 


GCTGATGCCG


ATAAGCAAAC 


AGCATATACA


AATGCAGTGT 


CTCAAGCGGA 


26220 




AGGTA7TATC 


AATCAAACGA 


CAAATCCAAC 


GCTTAACCCA 


GATGAAATAA 


CACGTGCATT 


26280 




AACTCAAGTG 


ACTGATGCTA 


AAAATGGCTT 


AAACGGTGAA 


GCTAAATTGG 


CAACTGAAAA 


26340 




GCAAAATGCT 


AAAGATGCCG 


TAAGTGGGAT 


GACGCATTTA 


AACGATGCTC 


AAAAACAAGC 


26400 
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AGCAACGAGC CTAGATCAAG CAATGGATCA ATTATCACAA GCTATTAATG ATAAAGCTCA 26 5 20 

AACATTAGCG GACGGTAATT ACTTAAATGC AGATCCTGAC AAACAAAATG CGTATAAACA 26580

GGCAGTAGCA AAAGCTGAAG CATTATTGAA TAAACAAAGT GGTACTAATG AAGTACAAGC 26640

ACAAGTTGAA AGCATCACTA ATGAAGTGAA CGCAGCGAAA CAAGCATTAA ATGGTAATGA 26700

CAATTTGGCA AATGCAAAAC AACAAGCAAA ACAACAATTG GCGAACTTAA CACACTTAAA 26760

TGATGCACAA AAACAATCAT TTGAAAGTCA AATTACACAA GCGCCACTTG TTACAGATGT 26820

CACTACGATT AATCAAAAAG CACAAACGTT AGATCATGCG ATGGAATTAT TAAGAAATAG 26880

TGTTGCGGAT AATCAAACGA CATTAGCGTC TGAAGATTAT CATGATGCAA CTGCGCAAAG 26940

ACAAAATGAC TATAACCAAG CTGTAACAGC TGCTAATAAT ATAATTAATC AAACTACATC 27000

GCCTACGATG AATCCAGATG ATGTTAATGG TGCAACGACA CAAGTGAATA ATACGAAAGT 27060

TGCATTAGAT GGTGATGAAA ACCTTGCAGC AGCTAAACAA CAAGCAAACA ACAGACTTGA 27120

TCAATTAGAT CATTTGAATA ATGCGCAAAA GCAACAGTTA CAATCACAAA TTACGCAATC 27180

ATCTGATATT GCTGCAGTTA ATGGTCACAA ACAAACAGCA GAATCTTTAA ATACTGCGAT 27240 

GGGTAACTTA ATTAATGCGA TTGCAGATCA TCAAGCCGTT GAACAACGTG GTAACTTCAT 27300 

CAATGCTGAT ACTGATAAAC AAACTGCTTA TAATACAGCG GTAAATGAAG CAGCAGCAAT 27360

GATTAACAAA CAAACTGGTC AAAATGCGAA CCAAACAGAA GTAGAACAAG CTATTACTAA 27420

AGTTCAAACA ACACTTCAAG CGTTAAATGG AGACCATAAT TTACAAGTTG CTAAAACAAA 27480

TGCGACGCAA GCAATTGATG CTTTAACAAG CTTAAATGAT CCTCAAAAAA CAGCATTAAA 27540

AGACCAAGTT ACAGCTGCAA CTTTAGTAAC TGCAGTTCAT CAAATTGAAC AAAATGCGAA 27600

TACGCTTAAC CAAGCAATGC ATGGTTTAAG ACAGAGCATT CAAGATAACG CAGCAACTAA 27660

AGCAAATAGC AAATATATCA ACGAAGATCA ACCAGAGCAA CAAAACTATG ATCAAGCTGT 27720

TCAAGCCGCA AATAATATTA TCAATGAACA AACTGCAACA TTAGATAATA ATGCGATTAA 27780

TCAAGCAGCG ACAACTGTGA ATACAACGAA AGCAGCATTA CATGGTGATG TGAAGTTACA 27840

AAATGATAAA GATCATGCTA AGCAAACGGT TAGTCAATTA GCACATCTAA ACAATGCACA 27900

AAAACATATG GAAGATACGT TAATTGATAG TGAAACAACT AGAACAGCAG TTAAGCAAGA 27960

TTTGACTGAA GCACAAGCAT TAGATCAACT TATGGATGCA TTACAACAAA GTATTGCTGA 28020

CAAAGATGCA ACACGTGCGA GCAGTGCATA TGTCAATGCA GAACCGAATA AAAAACAATC 28080

CTATGATGAA GCAGTTCAAA ATGCTGAGTC TATCATTGCA GGATTAAATA ATCCAACTAT 28140

CAATAAAGGT AATGTATCAA GTGCGACTCA AGCAGTAATA TCATCTAAAA ATGCATTAGA 28200
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TCAATTAACA CCAGCTCAAC AACAAGCGCT AGAAAATCAA ATTAATAATG CAACAACTCG 2 3320 

TGATAAAGTG GCTGAAATCA TTGCACAAGC GCAAgCATtA AATGAAGCGA TGAAAGCATT 28380

AAAAGAAAGT ATTAAGGATC AACCACAAAC TGAAGCAAGT AGTAAATTTA TTAACGAGGA 28440

TCAAGCGCAA AAAGATGCTT ATACGCAAGC AGTACAACAC GCGAAAGATT TGATTAACAA 28500

AACAACTGAT CCTACATTAG CTAAATCAAT CATTGATCAA GCGACACAGG CAGTGACAGA 28560

TGCTAAAAAC AATTTACATG GTGATCAAAA ACTAGCTCAA GATAAGCAAC GTGCAACAGA 28620

AACGTTAAAT AACTTGTCTA ACTTGAATAC ACCACAACGT CAAGCACTTG AAAATCAAAT 28680

TAATAATGCA GCAACTCGTG GCGAAGTAGC ACAAAAATTA ACTGAAGCAC AAGCACTTAA 28740

CCAAGCAATG GAAGCTTTAC GTAATAGCAT TCAAGATCAA CAGCAAACGG AAGCGGGTAG 28800

CAAGTTTATC AATGAAGATA AaCCaCmAAA AGrTGCTTAC CAAGCAGCAG TTCAAAATGC 28860

AAAAGATTTA ATTAATCAAA CTAACAATCC AACGCTTGAT AAAGCACAAG TTGAACAATT 28920

GACACAAGCT GTTAACCAAG CTAAAGATAA CCTACACGGT GATCAAAAAC TTGCAGACGA 28980

TAAACAACAT GCGGTTACTG ATTTAAATCA ATTAAATGGT TTGAATAATC CGCAACGTCA 29040

AGCACTTGAA AGCCAAATAA ACAACGCAGC AACTCGTGGC GAAGTAGCAC AAAAATTAGC 29100

TGAAGCAAAA GCGCTTGATC AAGCAATGCA AGCATTACGT AATAGTATTC AAGATCAACA 29160

ACAAACAGAA TCTGGTAGCA AGTTTATCAA TGAAGATAAA CCGCAAAAAG ATGCTTACCA 29220

AGCAGCAGTT CAAAATGCAA AAGATTTAAT 7AACCAAACA GGTAATCCAA CACTCGACAA 29280

ATCACAAGTA GAACAATTGA CACAAGCAGT AACAACTGCA AAAGATAATC TACATGGTGA 29340

TCAAAAACTT GCTCGTGATC AACAACAAGC AGTAACAACT GTAAATGCAT TGCCAAACTT 29400

AAATCATGCA CAACAACAAG CATTAACTGA TGCTATAAAT GCAGCGCCTA CAAGAACAGA 29460

GGTTSCACAA CATGTTCAAA CTGCTACTGA ACTTGATCAC GCGATGGAAA CATTGAAAAA 29520

TAAAGTTGAT CAAGTGAATA CAGATAAGGC TCAACCAAAT TACACTGAAG CGTCAACTGA 29580

TAAAAAAGAA GCAGTAGATC AAGCGTTACA AGCTGCAGAA AGCATTACAG ATCCAACTAA 29640

TGGTTCAAAT GCGAATAAAG ACGCTGTAGA CCAAGTATTA ACTAAGCTTC AAGAAAAAGA 29700

AAATGAGTTA AATGGTAATG AGAGAGTCGC TGAAGCTAAA ACACAAGCGA AACAAACTAT 29760

TGACCAATTA ACACATTTAA ATGCTGATCA AATTGCAACT GCTAAACAAA ACATTGATCA 29820

AGCGACGAAA CTTCAACCAA TTGCTGAATT AGTAGATCAA GCAACGCAAT TGAATCAATC 29880

TATGGATCAA TTACAACAAG CAGTTAATGA ACATGCTAAC GTTGAGCAAA CTGTAGATTA 29940

CACACAAGCA GATTCAGATA AACAAAATGC TTATAAACAA GCTATTGCTG ATGCTGAAAA 30000
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TGCAAAACAA GCATTAAATG GTGATGAACG TGTAGCACTT GCTAAAACAA ATGGTAAACA 3 012 0 

TGACATCGAC CAATTGAATG CATTAAACAA TGCTCAACAA GATGGATTTA AAGGTCGCAT 30180

CGATCAATCA AACGATTTAA ATCAAATCCA ACAAATTGTA GATGAGGCTA AGGCACTTAA 30240

TCGTGCAATG GATCAA7TGT CACAAGAAAT CACTGACAAT GAAGGACGCA CGAAAGGTAG 30300

CAC3AACTAT GTCAATGCAG ATACACAAGT CAAACAAGTA TATGATGAAA CGGTTGATAA 30360

AGCGAAACAA GCACTTGATA AATCGACTGG TCAAAACTTA ACTGCAAAAC AAGTTATCAA 30420

ATTAAATGAT GCAGTCACTG CAGCTAAGAA AGCATTAAAT GGTGAAGAAA GACTTAATAA 30480

TCGTAAAGCT GAAGCATTAC AAAGATTGGA TCAATTAACA CATCTAAACA ATGCTCAAAG 30540

ACAATTAGCA ATCCAACAAA TTAATAATGC TGAAACGCTA AATAAAGCAT CTCGAGCAAT 30600

TAATAGAGCA ACTAAATTAG ATAATGCAAT GGGTTCAGTA CAACAATATA TTGACGAACA 30660

GCACCTTGGT GTTATCAGCA GCACAAATTA CATCAATGCA GATGACAATT TGAAAGCAAA 30720

TTATGATAAT GCAATTGCGA ATGCAGCACA TGAGTTAGAT AAAGTGCAAG GTAATGCAAT 30780

TGCaAAAGCT GAAGCAGAGC AATTGAAACA AAATATTATC GATGCTCAAA ATGCATTAAA 30840

TGGAGACCAA AACCTTGCAA ATGCCAAAGA TAAAGCAAAT GCGTTTGTTA ATTCGTTAAA 30900

TGGATTAAAT CAACAGCAAC AAGATCTTGC ACATAAAGCA ATTAACAATG CCGATACTGT 30960

ATCAGATGTA ACAGATATTG TTAATAATCA AATTGACTTA AATGATGCAA TGGAAACATT 31020

GAAACATTTA GTTGACAATG AAATTCCAAA TGCAGAGCAA ACTGTCAATT ACCAAAACGC 31080

TGACGATAAT GCTAAA 31096 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2243 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

ATGACAGAAT GGGAGCGAGG ACTTAGAATG TTTCCTAAAT CAGGTTTATT AAATTTTGAG 60

TTAGCGATAG mAAATCGTTC ATTAAATGAT GATGAAAAAG CATTAAAATA TGTGCGTAAA 120

GCATTAAATG CAGACCCTAA AAATACAGAT TATATTAACT TAGAAAAAGA GTTGACTAAA 180

TCAAATGAGT CGAAAAATAA ATAACTTTTA TGATGTACAA CAGTTATTGA AAAGTTACGG 240

ATTTCTAATA TATTTTAAAA ATCCAGAAGA TATGTACGAA ATGATTCAAC AGGAGATTTC 300
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TAATCAGAGA AGGAATGAAC AGAAATGACA AAAATTATTT TAGCAGCTGA TGTAGGCGGG 420 

ACGACTTGTA AATTAGGTAT TTTCACACCT GAATTAGAAC AATTACATAA ATGGTCTATT 480

CACACTGATA CATCTGATAG TACAGGATAT ACACTTTTGA AAGGAATTTA TGATTCGTTT 540

GTTGAAAAAG TAAATGAAAA TAATTATAAT TTTTCAAATG TACTTGGCGT AGGTATTGGT 600

GTACCAGGTC CTGTTGACT7 TGAAAAAGGT ACAGTAAATG GAGCAGTAAA CTTATATTGG 660

CCAGAAAAAG TTAATGTACG TGAGATTTTT GAACAATTCG TTGATTGTCC AGTGTATGTA 720

GATAATGATG CTAACATAGC TGCTTTAGGG GaGAAACACA AAGGTGCTGG TGAAGGTGCC 780

GATGATGTTG TTGCCATCAC ACTTGGTACA GGTCTAGGTG GAGGAATTAT TTCCAAATGG 840

TGAAATCGTA CATGGTCATA ATGGCTCtGG CGCAGAAATA GGTCATTTTA GAgCAGACTT 900

CgATCAACGA TTTaAATGTA ATTGTGGTCG TTCTGGATGT ATTGAAACAG TTGCTTCaGC 960

GACAGGCGTT GTTAACTTAG TTAACTTCtA CTATCCGAAG TTGACGTTTA GATCTTCTAT 1020

ATTAGAATTG ATTAAAGAAA ATAAGGTtAC aGCAAAAGCT GTTTTTGATG CGGCAAAAGC 1080

TGGTGACCAA TTCTGTATTT TCATTACTGA AAAGGTTGCA AACTATATTG GATATTTATG 1140

TAGTATTATT AGTGTTACAA GTAATCCGAA ATATATCGTT CTAGGTGGAG GAATGTCTAC 1200

TGCAGGACCT ATTTTAATTG AAAATATTAA AACAGAATAT CATAATTTAA CATTTGCACC 1260

TGCTCAATTT GAAACTGAAA TTGTACAAGC GAAATTAGGT AATGATGCAG GTATTACAGG 1320

AGCAGCAGGA TTAATCAAGA CCTATGTATT AGATAAAGAG GGGGTAAAAT AATGGCTATT 1380

GTTGATGTGG TTGTTATTCC AGTTGGAACG GAAGGTCCGA GTGTTAGTAA ATATATTGCA 1440

GATATTCAGA AAAAACTTCA AGAATATAAA GCAATGGGTA AAATTGATTT TCAATTAACA 1500

CCAATGAATA CTCTAATTGA AGGTGAATTA AGCGATGTAT TAGAAGTTGT GCAAGTGATA 1560

CATGMTTAC CTTTTGATAA AGGTTTAAGT AGAGTTTGTA CAAATATCCG TATTGATGAC 1620

CGACGAGACA AATCTAGAAA AATGAATGAT AAACTAACAT CAGTACAAAA ACATTTAGAA 1680

AATAGTGGTG AAAACCTATG AGGATTTCAA GCTTAACTTT AGGCTTAGTT GATACTAATA 1740

CGTATTTCAT CGAAAATGAC AAAGCTGTTA TTCTGATTGA CCCTTCAGGT GAAAGTGAAA 1800

AAATTATTAA AAAATTAAAC CAAATAAATA AACCGTTAAA AGCTATTTTA TTAACACATG 1860

CACACTTTGA TCATATCGGA GCAGTCGATG ATATAGTTGA TCGATTCGAT GTCCCGGTTT 1920

ATATGCATGA AGCAGAGTTT GATTTTCTAA AAGATCCCGT TAAAAATGGG GCAGATAAAT 1980

TTAAGCAATA TGGATTACCA ATTATTACAA GTAAGGTAAC TCCTGAAAAG TTAAmCGAAG 2040

GTAGCACAGA AATAGAAGGA TTTAAGTTnT nAyrTGTaCA CACACCTGGA CATTCACCAG 2100
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GAATCGGACG TACAGATTTA TATAAAGGTG ATTATGAAAC GCTAGTTGAT TCTATTCAAG 222 0 

ATAAAATATT TGAATTAGAA GGC 2243

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8009 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

TTGGnATCAT CyAcgGTAAA AAGAATAAaG CAAGATTtAT TTCATTAGTA CTAATTTGTG 60

CAATGTTTGC AATTTGTTGG GTTGCATATA TTCAATGGGA GTCTACAATC GCTTCATTTA 120

CACAATCTAT TAATATTTCa ATGGCACAAT ATAGTGTTTT ATGGACAATT AACGGAATAA 180

TGATTTTAGT AGCACAACCA TTAATTAAAC CGATTCTCTA TCTGTTAAAA GGAAACTTAA 240

AGAAGCAAAT GTTTGTCGGC ATCATCATTT TTATGTTGTC GTTCTTTGTC ACGAGTTTTG 300

CCGAAAACTT TACAATATTT GTTGTCGGTA TGATTATTTT AACTTTTGGA GAAATGTTTG 360

TATGGCCAGC AGTTCCAACT ATAGCCAATC AGTTAGCGCC AGATGGTAAG CAAGGACAGT 420

ACCAAGGTTT TGTGAATTCA GCTGCTACAG TAGGAAAAGC ATTTGGTCCA TTTCTTGGTG 480

GTGTATTAGT TGATGCGTTT AATATGCGCA TGATGTTTAT CGGTATGATG CTACTACTTG 540

TATTTGCATT AATATTATTA ATGGTTTTCA AGGAGAATAA TACGCAACCT AAAAAAATAG 600

ATGCATAATG AGTAAATAGA ATTAACGTTA TAGACTTGAA ATAAATGTCG TTATAACATA 660

ATATTAATTT GTATAATTTA ATTTCGTTTG GAGCTTTTCT ACAGAAAGCT AGTGATGCTG 720

AGAGCTAGTG TTAAGGACTA AATGTAAATC GTATTAATTT TAAATTGAAT GAATGACATC 780

TCTTACTATT AAAATGAGTG CACAATTTTT GTGAAATAGG GTGGTAACGC GGCAAATGTC 840

GTCCCTATGT AAATAGAATA GTTAGAGGTG TCTTTTTTAT TGAATAGGAG GAAATGTGTT 900

GAATTACAAC CACAATCAAA TTGAAAAGAA ATGGcAAGAC TATTGGGACG AAAATAAAAC 960

ATTTAAAACA AATGATAACT TAGGTCAAAA GAAATTTTAT GCTTTAGACA TGTTTCCATA 1020

TCCATCAGGT GCTGGTTTAC ATGTTGGACA TCCTGAGGGc TATACAGCAA CAGATATCAT 1080

TTCAAGATAT AAAAGAATGC AAGGATATAA TGTATTACAT CCGATGGGGT GGGATGCATT 1140

CGGATTACCA GCAGAGCAAT ATGCTTTAGA CACTGGCAAC GACCCACGTG AATTTACAAA 1200

GAAAAATATC CAAACTTTTA AACGACAAAT TAAAGAATTA GGGTTCAGTT ATGATTGGGA 1260
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GTTATATAAC AAAGGTTTAG CATACGTTGA TGAAGTTGCA GTTAACTGGT GTCCAGCATT 1383 

AGGCACTGTT TTATCTAACG AAGAAGTGAT TGATGGTGTC TCTGAACGTG GTGGACATCC 1440

AGTTTATCGT AAGCCGATGA AACAATGGGT ACTTAAAATC ACAGAATATG CAGATCAATT 1500

ATTAGCAGAT TTAGATGATT TAGATTGGCC TGAGTCTTTA AAAGATATGC AGCGCAATTG 1560

GATTGGACGT TCTGAAGGGG CCAAAGTTTC ATTTGATGTA GATAATACGG AAGGAAAAGT 1620

AGAAGTATTT ACGACTAGAC CAGATACAAT CTATGGTGCA TCATTCTTAG TCTTAAGTCC 1680

TGAACATGCA TTAGTTAATT CAATTACAAC AGATGAATAT AAAGAAAAAG TAAAAGCTTA 1740

TCAAACAGAA GCTTCTAAAA AGTCAGATTT AGAACGTACA GATTTAGCAA AAGATAAATC 1800

AGGTGTATTT ACTGGTGCAT ATGCAACTAA TCCTTTATCT GGTGAAAAAG TACAAATTTG 1860

GATTGCTGAT TATGTATTAT CAACATATGG TACTGGAGCA ATTATGGCAG TACCAGCGCA 1920

TGATGACAGA GATTATGAAT TTGCTAAAAA GTTTGATTTG CCAATCATTG AAGTCATCGA 1980

AGGTGGAAAT GTTGAAGAAG CAGCATACAC TGGTGAAGGT AAACATATTA ATTCTGGTGA 2040

ACTTGATGGT TTAGAAAATG AAGCGGCAAT TACTAAAGCT ATTCAATTAT TAGAGCAAAA 2100

AGGTGCTGGC GAAAAGAAAG TTAATTACAA ATTAAGAGAT TGGTTATTCA GTCGTCAGCG 2160

TTATTGGGGC GAACCAATTC CTGTCATTCA TTGGGAAGAT GGAACAATGA CAACTGTTCC 2220

TGAAGAAGAG CTACCATTGT TGTTACCTGA AACAGATGAA ATCAAGCCAT CAGGGACTGG 2280

TGAGTCTCCA CTAGCTAATA TTGATTCATT TGTAAATGTT GTAGATGAAA AAACAGGTAT 2340

GAAAGGACGT CGTGAAACAA ATACAATGCC ACAATGGGCA GGTAGTTGTT GGTATTATTT 2400

ACGTTACATC GATCCTAAAA ATGAAAATAT GTTAGCAGAT CCTGAAAAAT TAAAACATTG 2460

GTTACCTGTT GATTTATATA TCGGTGGAGT AGAACATGCG GTTCTTCACT TATTATATGC 2520

AAGATTTTGG CATAAAGTCC TTTATGATTT GGCTATCGTA CCTACTAAAG AACCTTTCCA 2580

AAAATTATTT AACCAAGGTA TGATTTTAGG AGAAGGTAAT GAGAAGATGA GTAAATCTAA 2640

AGGAAATGTA ATCAATCCTG ATGATATAGT ACAGTCTCAT GGTGCAGATA CTTTGCGTCT 2700

TTACGAAATG TTTATGGGAC CTTTAGATGC TGCAATTGCA TGGAGTGAAA AAGGATTAGA 2760

TGGGTCTCGT CGATTCTTAG ATCGCGTATG GCGTTTAATG GTAAATGAAG ATGGGACATT 2820

GAGTTCAAAA ATTGTAACTA CAAATAATAA ATCTTTAGAT AAAGTTTATA ACCAAACTGT 2880

TAAAAAGGTA ACAGAAGACT TTGAAACATT AGGATTTAAT ACTGCTATTA GTCAATTAAT 2940

GGTATTTATT AATGAGTGTT ATAAAGTTGA TGAAGTTTAT AAACCTTACA TTGAAGGCTT 3000

CGTTAAAATG TTAGCACCTA TTGCACCACA TATCGGTGAA GAATTATGGT CAAAATTAGG 3060
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TGATGAAGTA GAAATCGTTG TTCAAGTGAA TGGTAAATTG AGAGCTAAAA TTAAAATTGC 3180 

TAAAGATACA TCAAAAGAAG AAATGCAAGA AATTGCCTTA TCTAATGACA ATGTTAAAGC 3240

GAGTATTGAA GGTAAAGACA TCATGAAAGT CATCGCTGTT CCTCAAAAAT TAGTCAATAT 3300

TGTAGCTAAA TAATGTTTTA AGGAGGACTT TGAAATGAAG TCAATTACTA CAGATGAATT 3360

AAAAAATAAA CTTTTAGAAT CTAAACCAGT TCAAATTGTT GATGTTCGTA CTGATGAAGA 3420

AACAGCAATG GGATATATTC CTAATGCAAA GTTAATTCCA ATGGATACCA TTCCGGATAA 3480

TTTAAATTCA TTTAATAAAA ATGAAATATA TTATATTGTA TGTGCTGGTG GAGTTCGAAG 3540

CGCTAAAGTT GTAGAATATT TAGAGGCAAA TGGCATTGAT GCCGTAAATG TCGAAGGCGG 3600

CATGCACGCA TGGGGCGATG AAGGTTTGGA AATAAAAAGT ATTTAAAGTA GTGACATAAT 3660

TTAAAATAAT ATTACATTTG TAATGACACC AAGTAACGTT TCGGTTGCTT GGTGTTTTTT 3720

GGTATGAATT ACTTTCTGTT ACAAAACAAT CTAAAGCGTT CTTGTTATGT TTTATTAAGA 3780

TTTTAATTAC AAAACGGAAA CTAAATTGTA ATAAAATAAA ACTTTATTTT ATAAAATGAT 3840

GATGATAAAA TTGAGTGAAC TTAAAATATT GTACAAAATA ATATAGCTAT AAATATAATA 3900

TAGCTATAAA TATAATATGA GGGAGCGTAT ATTTTTAGCA TAATTCTTAA CAACACAGCA 3960

GAGAACAGAC AACCAGGAGG AAAATGAAAT GAATTTGTTA AAGAAAAATA AATATAGTAT 4020

TAGGAAGTAT AAAGTAGGCA TATTCTCTAC TTTAATCGGA ACAGTTTTAT TACTTTCAAA 4080

CCCAAATGGT GCACAAGCCT TAACTACGGA TAATAATGTA CAAAGCGATA CTAATCAAGC 4140

AACACCTGTA AATTCACAAG ATAAAGATGT TGCTAATAAT AGAGGTTTAG CAAATAGTGC 4200

GCAGAATACA CCTAATCAAT CTGCAACAAC CAATCAAGCA ACGAATCAAG CATTGGTTAA 4260

TCATAATAAT GGTAGTATAG TAAATCAAGC TACGCCAACA TCAGTGCAAT CAAGTACGCC 4320

TTCAGCACAA AACAATAATC ATACAGATGG CAATACAACA GCAACTGAGA CAGTGTCAAA 4380

CGCTAATAAT AATGATGTAG TGTCGAATAA TACCGCATTA AATGTACCAA CTAAAACAAA 4440

TGAAAATGGT TCAGGAGGAC ATCTAACTTT AAAGGAAATT CAAGAAGATG TTCGTCATTC 4500

TTCAAATAAA CCAGAGCTAG TTGCAATTGC TGAACCAGCA TCTAATAGAC CGAAAAAGAG 4560

AAGTAGACGT GCGGCACCGG CAGATCCTAA TGCAACTCCA GCAGATCCAG CGGCTGCAGC 4620

GGTAGGAAAC GGTGGTGCAC CAGTTGCAAT TACAGCGCCA TATACGCCAA CAACTGATCC 4680

TAATGCCAAT AATGCAGGAC AAAATGCACC TAACGAAGTG CTGTCATTTG ATGACAATGG 4740

TATTAGACCA AGTACCAACC GTTCTGTGCC AACAGTAAAC GTTGTTAATA ACTTGCCGGG 4800

CTTCACACTA ATCAATGGTG GCAAAGTAGG GGTGTTTAGT CATGCAATGG TAAGAACGAG 4860

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TCGTATACAT GGAACTGATA CGAATGACCA TGGCGATTTT AATGGTATCG AGAAAGCATT 4980

AACAGTAAAT CCGAATTCTG AATTAATCTT TGAATTTAAT ACAATGACTA CTAAAAACGG 5040

TCAAGGCGCA ACAAATGTTA TTATCAAAAA TGCTGATACT AATGATACGA TTGCTGAAAA 5100

GACTGTTGAA GGCGGTCCAA CTTTGCGTTT ATTTAAAGTA CCTGATAATG TGAGAAATCT 5160

CAAAATTCAA TTTGTACCTA AAAATGACGC AATAACAGAT GCGCGTGGCA TTTATCAACT 5220

AAAAGATGGT TACAAATACT ATAGCTTTGT TGACTCTATC GGACTTCATT CTGGGTCACA 5280

TGTTTTTGTT GAAAGACGAA CAATGGATCC AACAGCAACA AATAATAAAG AGTTTACTGT 5340

AACAACATCA TTAAAGAATA ATGGTAATTC TGGTGCTTCT CTAGATACAA ATGACTTTGT 5400

ATATCAAGTT CAATTACCTG AAGGTGTTGA ATATGTGAAC AATTCATTGA CTAAAGATTT 5460

TCCAAGTAAC AATTCAGGCG TTGATGTTAA TGATATGAAT GTTACATATG ATGCAGCAAA 5520

TCGTGTGATA ACAATTAAAA GTACTGGAGG AGGTACAGCA AACTCTCCGG CACGACTTAT 5580

GCCTGATAAA ATACTCGATT TAAGATATAA ATTACGTGTA AATAATGTGC CGACACCAAG 5640

AACAGTAACA TTTAACGAGA CATTAACGTA TAAAACATAT ACACAAGATT TCATTAATTC 5700

AGCTGCAGAA AGTCATACTG TAAGTACAAA TCCATATACT ATCGATATCA TCATGAATAA 5760

AGATGCATTA CAAGCCGAAG TTGACAGACG TATTCAACAA GCTGATTATA CATTTGCGTC 5820

ATTAGATATC TTTAATGGTC TGAAACGACG CGCACAAACG ATTTTAGATG AAAATCGTAA 5880

CAATGTACCA TTAAATAAAA GAGTTTCTCA AGCATATATT GA7TCATTAA CTAATCAAAT 5940

GCAACATACG TTAATTCGAA GTGTTGATGC TGAAAATGCA GTTAATAAAA AAGTTGACCA 6000

AATGGAAGAT TTAGTTAATC AAAATGATGA ATTGACAGAT GAAGAAAAAC AAGCAGCAAT 6060

ACAAGTTATC GAGGAACATA AAAATGAAAT AATTGGTAAT ATTGGTGACC AAACGACTGA 6120

TGATSGCGTT ACTAGAATCA AAGATCAAGG TATACAGACC TTAAGTGGGG ATACTGCAAC 6180

ACCGGTTGTT AAACCAAATG CTAAAAAAGC AATACGTGAT AAAGCAACGA AACAAAGGGA 6240

AATTATCAAT GCAACACCAG ATGCTACTGA AGACGAGATT CAAGATGCAC TAAATCAATT 6300

AGCTACGGAT GAAACAGATG CTATTGATAA TGTTACGAAT GCTACTACAA ATGCTGACGT 6360

TGAAACAGCT AAAAATAATG GCATCAATAC TATTGGAGCA GTTGTTCCTC AAGTAACTCA 6420

TAAAAAAGCT GCAAGAGATG CAATTAACCA AGCAACAGCA ACGAAAAGAC AACAAATAAA 6480

TAGTAATAGA GAAGCAACTC AGGAAGAGAA AAATGCAGCA TTGAACGAAT TAACTCAAGC 6540

AACCAACCAT GCTTTAGAAC AAATCAATCA AGCAACAACA AATGCTAATG TTGATAACGC 6600

CAAAGGAGAT GGTCTAAATG CCATTAATCC AATTGCTCCT GTAACTGTTG TTAAGCAAGC 6660
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TGATGCGACT CAAGAAGAAA GACAAGCAGC AATTGACAAA GTGAATGCTG CTGTAACTGC 6730 

AGCAAACACA AACATTTTAA ACGCTAATAC CAATGCTGAT GTTGAACAAG TAAAGACAAA 6840

TGCGATTCAA GGAATACAAG CAATTACACC AGCTACAAAA GTAAAAACAG ATGCAAAAAA 6900

TGCCATCGAT AAAAGTGCGG AAACGCAACA TAATACGATA TTTAATAATA ATGATGCGAC 6960

GCTCGAAGAA CAACAAGCAG CACAACAATT ACTTGATCAA GCTGTAGCCA CAGCGAAGCA 7020

AAATATTAAT GCAGCAGATA CGAATCAAGA AGTTGCACAA GCAAAAGATC AGGGCACACA 7080

AAATATAGTA GTGATTCAAC CGGCAACACA AGTTAAAACG GATACTCGCA ATGTTGTAAA 7140

TGATAAAGCG CGAGAGGCGA TAACAAATAT CAATGCTACA ACTGGCGCGA CTCGAGAAGA 7200

GAAACAAGAA GCGATAAATC GTGTCAATAC ACTTAAAAAT AGAGCATTAA CTGATATTGG 7260

TGTGACGTCT ACTACTGCGA TGGTCAATAG TATTAGAGAC GATGCAGTCA ATCAAATCGG 7320

CGCAGTTCAA CCGCATGTAA CGAAGAAACA AACTGCTACA GGTGTATTAA ATGATTTAGC 7380

AACTGCTAAA AAGCAAGAAA TTAATCAAAA CACAAATGCA ACAACTGAAG AAAAGCAAGT 7440

GGCTTTAAAT CAAGTGGATC AAGAGTTAGC AACGGCAATT AATmATATAA ATCAAGCTGA 7500

7ACAAATGCG GAAGTAGATC AAGCGCAACA ATTAGGTACA AAAGCAATTA ATGCGATTCA 7560

GCCAAATATT GTTAAAAAAC CTGCAGCATT AGCACAAATC AATCAGCATT ATAATGCTAA 7620

ATTAGCTGAA ATCAATGCTA CACCAGATGC AACGAATGAT GAGAAAAATG CTGCGATCAA 7680

TACTTTAAAT CAAGACAGAC AACAAGCTAT TGAAAGTATT AAACAAGCTA ACACAAATGC 7740

AGAAGTAGAC CAAGCTGCGA CAGTAGCAGA GAATAATATC GATGCTGTTC AAGTTGATGT 7800

AGTAAAAAAA CAAGCAGCGC GAGATAAAAT CACTGCTGAA GTGGcGAacG TATTGaAGCG 7860

GTTAAACAAA CACCTAATGC AACTGACGAA GAAAAGCAGG CTGCTGTTAA TCAAATCCAA 7920

TCAACTTTAA AGATTCAAGC AATTTAATCC AAATTTAATC CAAAACCCAA ACAAATGGAT 7980

TCAGGGTAGG ACACCACTTA CAAATCCAA 8009
(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10953 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

ACCCACCCCn TGGGGATAnT TTACCTGGTG GGGCCTTCGA TTGCCTTTAG GTGAAACCaG 60
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AGATGAATGC TAACCATATT CATTCTGCTA AAGATGCTCG TGTTACTGCG A C AG CT G AAA 18O 
ITATTCATCG AGGTAAGTCG ACACATGTAT GGGATATAAA AATTAAGAAT GACAAAGAAC 240

AATTAATTAC AGTTATGCGT GGTACAGTTG CTATTAAACC TTTAAAATAA AAGAACTGCT 300

AGCTGAAATG TTATGAGATA TTCATAACTA CGGCTAGCAG TTTTTTTATG CGCTATATTG 360

TTGTAGTTTT AGAAATGCTT GTTCAATGCG TTCGGCAGCT TTACGGCCAC CCATAACATT 420

TCTACCAAAT GGTCCTAATT CTAAGTCTGC AAAGCATCCT GCGACAAATA GATTTGGTAT 480

CCATTCTAAT TTTTCGGAAA TAACAGGGTA ATTACATTCG TTGATAGGTG CATCATAATT 540

TTGTATTAAT TGCTTAATAA GTGGTTGTGA CATAAAATCT TGTTCAAAAC CAGTTGCAAC 600

CATAATCTGT TGATATGGAA CAGAATCATT TTCAGTGTTA ATTACACCAC CACTAATTTG 660

AGTGATAGGT GTTTTATGCa CATTTATACG ACCATTTTTA ATATGTTTTT TAAGGCGTAA 720

GTACAGTTCG TGAGGCATTG ATCCTTTATG ACGTTCGCGT TGTACAATGG CATTTCTTTC 780

AGGCATGCTT TTAGTACTTA AAAATGAAGA CATATTTTTC GGACCTAACC AACCAGGATC 840

AGCATCAAAG TCATGTATTT CAATATCTTT ATTTAGCCAT AAATGAATCT TTTTATCGTT 900

ATCATGATTT AACAATTTAA GTGCAAGATG TGCAGCAGTa ATGCCGCTAC CAACGATATG 960

ATCGGTCTTA TCATATACTA CTTGATCAAG TTCTTTCTCG AAGATATGAT TTACATTCTG 1020

TTTGTCTTTT AAAATGTCAG GCATAAACGG AATATTTGTA CTGCCTATTG CAATAACGAC 1080

GCAATCTGTA GTGATAATTT GTCCATCTTC TAACTTGATA TGCCATTTGT CTTCTTGTTT 1140

ATCTAAAGTT TGAACTAAAC CTTGAACCAA GCAATCCTCT AATTGATATT GTTTAGAAGC 1200

ATGTGCAATA TGATCCATAA ACATTGTCAA TTCAGGTCGT TGATAAGGAC CATAAAAAGC 1260

ATTTGTATAT TGGTGCTGTT TAGCGAATTG TTTTAGATGG AACGGTTGTG GATGTACGTG 1320

ATGTACAATC GGTGATCTTA AATAAGGCAT TTCTATTCGA TTTGTATATG AGTTAAACCT 1380

TTGGCAAAAA GTTTCGTGTG GGTCAATGAT TGTTAATCGG TCTGTTGTTA ATCCGCTTGA 1440

TAATAGTTTT TGTGCGATTG CAGTTCCCTG TATGCCACCG CCGATAATTG TCCAATGCAT 1500

AATAAAACCT CTCTCTTTTT AAAACGTAAT AGTTACGATT TATAATTATT ATTATCATAA 1560

TACATAACGA CATGAAAGGC AATTAAATTA AAGAGATATA TGTAGATAGG GCGAATCTGT 1620

AGTCAAAGAA AAAATCATTG AAAAAGAGGT AACAATGTCA AAAGAwAACA GCAGTAAAAT 1680

CATTCCTAAT TTGGAATCAT CTTACTGCTG TTTGTTGTTG ATTTATATTC ATGATTTTGT 1740

TATATAATCT ACAATTTTGT GTCTTTTAAG TCTTCCGAAA TTTCATCGAC TTTAGTCTTT 1800

TTAGTATAAG GCGTTTTAAT ATTATATGCT GCTTTCATAA TCATATGACT TGAAAGAGGA 1860
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GCAATAAAAT ATAAAAACGT ACCAAATAGT AATGACATTG CACCTAATGT TGATGCTTTT 19 80 

CCGGCAGCAT GTGCACGTGA ATATACATCT TCAAGTCTCA ATAATCCTAT AGCTGCTAGG 2040

GCGCTAATTA AAGCACCGAT GATAACAAAG ATAAGTGCAA GACTAATCAG TATGATTTTG 2100

ATCATGT7CA ATCAGCTTAC CTTTGTCCAT AAATTTAGAG AATACTGCAG TACCTAAAAA 2160

AGCTAATATA CCAATCATCA TAATAACGAC AATCATGTAT TTAATATTTA ATAAAATACT 2220

GAATAATGCT ATAACTGCCA TTAATTGAAG ACCAATCGCA TCTAATGCGA CAACACGATC 2280

GGCAAGTGAT GGGCCTAGCA CAACGCGAAT GAGCATAGCT AACATAGAAA TGACAACTAT 2340

GATTAATGCA ATAACGATAA TAACATTATG ATTCATTATA TTTCGCCCAC CTCTCTTACA 2400

ATTTTCTCTA ATGATGTTTT AATACTTTCT ACTTCTTGCT CTTTAGTTGA AAAATCTATG 2460

GCATGAATAT AAATTTTTGT ACGATCGTCA CTTACACCAA GCACTACAGT ACCAGGTGTT 2520

AATGTAATTA AATTAGACAG CAAGACAATT TGCCAATCTT TTTTTAAATC TGTGTGATAA 2580

ACAAAGAATC CTGGTTCATT TTTAATCGAA GGTTTAATAA TAATTTTCAA AACATCAAAA 2640

TTAGCTTTAA TCAGTTCGAT TAAGAAAATA ATAACTAATT TAATAATACG ATATAGCGTG 2700

ATGACATAAA ATCTACCTGG TAACACTCTG TGTAAGAGGT AAACAAGAAC TAGGCCAAAG 2760

ATGAAACCTA ACACAAAGTT ATTTGTTGTG TAACTATTTG TCACAAACAA CCAAAACACT 2820

GCGATAATAA AGTTTAATAC TAATTGTACA GCCATGTTAT TTACCTCCTA ATACAGCTTT 2880

AACGTAGGTT GATGGATTGT AGAATGTTTC TGCACCAGCT TTTACCATTG GATATAAGTA 2940

ATCTGCTGAC AATCCATATA AAACAGTTAT CACAACTGCA ACGATTGCAA TCGTAGTTAA 3000

ATATTTGACG TCGACTTTGT TATTAAGATC ATATCCTTTT GGTTGACCGA AAAAGCCTTG 3060

TAGGAATATG CGAATGACAG AATATAATAC GACTAAACTT GATAATAAGA CGATGACACC 3120

ACTTAAATAA AATCCTCTTT CAAATGTTGA TTGGACAATA AAAAATTTTC CATAAAAGCC 3180

ACTGAGTGGG GGAATGCCAG CTAAACTTAA TGCTGCGATA AAGAATGACC AACCAAGTAC 3240

AGGATATCGT TTAATTAAGC CACCAAATTG TCTTAAATCA GCAGTGCCTG TAATTTTAAT 3300

CATAATTCCG ATAAGCAAGA ATAATGCAAG TTTTACTAAC ATGTCGTGCA ATGTATAGTA 3360

AATAGCCCCA ATCATACCTG ACTCTGTCAT CATTGCAACG CCGACTAAGA TCACACCTAC 3420

AGCAATCATG ACATTGTATA GGATGATTTT TTTAATGTTG GCATATGCAA CAGCACCGAC 3480

ACAACCAAAG ATGATCGTTA ATAGTGCTAA GAATAAAATG ACATAATGTG AAAAGCTTAC 3540

ATTATCACTA AAGAATAGGC TCAATGTTCT AGCGATTGCA TAAACACCAA CTTTTGTTAA 3600

CAAAGCACCA AAGAATGCAA TGATTGGAAT TGGTGGgCAT AGTATGCACT AGGTAACCAA 3660
ATATTGACTA AGCCACTGTC ATGCGCTGAA AGGTTAGCTA ATTTATTGCT TATATCTGCT 3780

AGATTCAATG TTCCTACTAC TGAATATAAA ATCGCTACAC CCATTACGAA GAAGGATGAC 3 64 0 

S GATACAACGT TAACAAGAAC ATATTTTATT GTTTCTTGTA GTTGAATTTT TGTAGAACCA 3 90 0 

ATTACTAATA AGAAATAAGA TGACATTAAA AATACTTCGA AAAATACGAA TAGGTTGAAA 3 96 0 

ATGTCACCAG TTGTGAATGC ACCAATGATA CCTATTAACA TAAATAGTAC TGAAAAATAA 4 02 0 

10 

TAATAATATC TTTCACGTTC AATACCAATT GTTTGGTATG AATATAAAAT CACAATAGCT 4 0 80 

GTAATAATAA TACTAGTAAT TATTAGTAGG GCACTGAATA TG TCTAAT AC AAAGACAATA 414 0 

CTGTATGGTG CTTTCCATGA ACCTAGCTCT ACGCGTATTG GTCCATGTTT AACAACATTT 42 00 

15 

GCTAAATTGA TAATTGCCGC GACCAAGGTT AATAATGTAC CGCCTAGTGC GACATAACGC 4 2 60 

TTTATAATAG GACGCTTTCC AATAAAGACA AGTAATATGG CTGTAATTAC TGGAATAACT 4 3 20 

AGCGTTAACA CAAGCATATT ACTTTCAATC ATCTTCTGGA ACTCCTTTCA TACTCTCAAC 43 8 0 

20 

GTTATCTGTG CCTAATTCTT TATATGTTCT AAATGCTAAT ACTAAGAAAA AGGCTGTTGT 444 0 

CGCAAgGCGA TAACGATTGC TGTTAAAATA AGTGCTTGCG GGaTAGGaTC AACATAGCTT 4 500 

25 TTTACGTTCG CTTCATAAAT TGGAACAGTA CCATGTTTAA GTCCGCCCAT AGTTATTAAA 4 56 0 

AATAAATTTG CTGCATGTGT TAATAGTGTA GTTCCCATAA CAATTCGTAT CAGACTTTTA 4 62 0 

GACAAAACGA GATAGACACT AATTG CTGTG AGAATACCAC TAACAAAAAT CATAATAATT 4680 

30 TCCACTATTC GTTCTCTCCA ATCGAAATAA TAATTGTCAT GACAGTACCA ACTACTGCAC 4 74 0 

AT AAAACA C C GAAATCAAAG AATACTGCTG TTGTCATATG AACAGGTTCT AATATAAATA 4 800 

ACGGTATATC AAATGTGACA TGCGTAAAGA AATTTTTGCC TAAAAACCAA CTTGCGATAG 4860 

35 

GCGTCGCAAT ACAAAAAACT AATCCGATAC CTATCAAGAT TTTAAAATCT AATGGGAAAA 4 92 0 

TTTTACGCAT TGTTTCTATA TCAAATGCAA TCGTAATGAT AACAAGTGAA CTTGCGAATA 4 9 BO 

ATAATCCGCC GACGAAACCG CCACCAGGTG TATAATGTCC TGCTAAGAAA AGTGAAAAAC 504 0 

40 

CAAAGACCAT TACCATGAAA AAGATAATAA CTGCAGCAAA TTGCAAAATT AGATCATTTT 5100 

GTTGTCTATT CATGATTTTT CACCTCGTTA CCTTGCGTTT GACGCTTTTT ACGTAATTTA 5160 

ATCATTGTAT ATACAGCTAA TCCTGCGATA CCAAGCACAG ATGACTCGAA TAAAGTATCC 52 2 0 

45 

ATACCACGGA AATCAACAAG TATGACGTTT ACCATGTTTT TACCGTGAGC t AAATCATAA 5280 

ACGTGCTCTT GATAAAACTT AGATATCGAT TCAAAATGTC TATTTCCGTA TGCAATTAAA 5 34 0 

so CCGATAATAA TGACGGACAA ACCAACACCA CCAGCAATTA AAGCATTAGT AAGCTGGAAT 54 00 

GAGCGCTTTT CATTATAACG ATTTAAATTT GGTAAGTGGT AGAAGCATAA TAAGAACAAT 54 6 0 

ATAAACAATA CAGACACAGC ATATCCAACT GCACTTAACA TAATGATGCT AAATAATCTT 5580

GATTTAGCGA AAAGAATTAA AAAGGCAGCA CTTAATAATA AAATTACGAT ACAAACTTCG 5640

AAAATTCTAA TCGGACTAAC GTCTTTAAAA TTAATGTTGA AAGGTACTGA GAATATAGTG 5700

ACAAATGTTA ATAAAATTAA TGCACCAAAA ATGATAACTA AATTATTACG TGAATAATCG 5760

GTAACATAGC TATTCGTCAT CTTTTCAGAG TAGTTTGGAA TAACATTTGC ACTTCTGTTG 5820

TACCAATAAT TGAATGTTAG TTTACCAGGT TGTCGTTGCA ACAATTTCAC CCAATAACTA 5880

AATGTCACAA TTAGTAAGAT ACCTAAAATA TAAATCACTA ATGTTGATAA AAAGGCAGGC 5940

GTTAATCCAT GGAACATATG GAATTCAACA TCATCAATTA CCGTATGATT AATCGAAGag 6000

TnAGCTGGTT CAATAATCGA ATTAGTTAAA ATGCCAGGGA ATAAACCAAA TACAATTACT 6060

AATGTAGCTA AAATAGCTGG TGATAAAAGC ATTAATATTG ATACTTCGTG TGCTTTTTTA 6120

GGTAATTGTT CAGGTTTATA TTGTCCGAAA AATATATGCA TTATAAATTT AATTGAATAT 6180

ACAAATGTGA AGACACTGCC CACTATACCA ATGATTGGGA ATAGGTAGCC TAATGTATCA 6240

ACACTGAATA AATTTGCTTG GCTTGCTGTA AATGTTGTTT CTAAAAATGA TTCTTTTGAT 6300

AAGAAACCAT TGAACGGTGG TACACCAGCg CATACTTAAT GCTGTAATAA CAGTGATTGT 6360

AAATGAAATA GGCATAATTG TTAGTAAGCC ACCTAATTTC TTAACATCAC GTGTACCAGT 6420

AGAATGATCC ACTGCACCTG TAATCATAAA TAGGGCACCT TTAAATGTTG CATGGTTGAT 6480

TAAATGGAAT ATTGCAGCCG TAAATGCAGC AGCATATATT TTGCTATCAT CGCCTTGATA 6540

GTGATAACTA ATGGCACCGA TTCCAAGCAT CGCCATAATC ATACCTAATT GGGATACTGT 6600

TGAAAATGCC AGTATACCTT TCAAGTCTTG TTGTTTTGTT GCGTTTAGCG AAgCCCAGAA 6660

TAATGTAATT AAACCAACGA GTGTGACAGT CCATACCCAA CCTTGCGATG CTGCGAAGAT 6720

TGGTGTCATT CGAGCGATTA AATATAACCC TGCTTTAACC ATTGTTGCTG AATGAAGATA 6780

AGCACTGACT GGTGTAGGTG CTTCCATTGC ATCTGGTAGC CAAATATAAA ATGGAAACTG 6840

AGCAGATTTT GTAAAAGCAC CAATCATGAT TAAAATCATC GCAAAAATGA AGAATGGGCT 6900

ATTTTGAATT TCAGAAGCAT GTTGAATCAT GTACTGAATG CTAAATGATT GTGTTGGTAT 6960

AGCGAGTAAG ATGATACCAC CTAATAATGA TAGACCACCA AATACTGTGA TTATGAGCGA 7020

TTTTTGAGCA CCATATATAG ATGCTTGTCG TTCGCGCCAG AATGAAATAA GTAAAAAACT 7080

AGAAAATGAC GTTAGCTCCC AGAATAAATA TAGAATAATA ACATTATCTG AAAGTACGAC 7140

ACCTAACATT GCACCCATAA ATAGTAATAA ATAACAATAA AAATTCCCTA GTTGTTCTGA 7200

CTTACTTAAG TAGCCGATTG AATATAATAC TACTAAACTG CCGATTCCTG AAATAAGCAA 7260

CCAATTTAAG GTTTTCATTA CAGTATTACC TGACATCGTC GTTTTAATTA ATGTAAGCAT 7380

ATAAATAAAT ATGACGATAG GGACAGGTAA TACGAACCAT CCTAAATGTA TACGTTTAAA 744 0 

AAATCTATAC AGGATAGGAA TAATGAGTGC GAATATTAAC GGTAATATCA CCGCAATATG 7 5 00 

TAACAAACTC ACTATGTTGT CCTCCTTTAA AAAATATTTA TGTTATTCAT TATACATGAA 75 60 

TGATATAGTT CTGAAAAACG TACACACTCC TTGTTGTGCT TTATTTTCAG AaGTATTTAA 76 2 0 

ATAAGAAGAA ACACGTCATT TTTTATTTAA AATTTTCTTT GTATTGAAGT GAATAATCTT 7 6 80 

CTTTTAAGCG TGCTAAACTA GCTAAAGACA TTTCAGCATG TTTTGTTTGC TGAGCTTTAA 774 0 

GTTTAGTTTC TAAATCTGTA ATTGCTTGTT GAAGTGAATC TTCATAGCGC AATACATCAA 7800 

CATTGAAGTC GCGTAATTGT GAACGTTTCG TATAGCGTTT TTCAAAATGG CTTAATGCTT 7860 

TGCGGTCATG GAAAAATACA CCTTCAGTTT CAGTAGGGTT ATGTAAATCA CCTTGTTTCG 7 92 0 

GGTGTTTGAT AACTTGTTCA ACTTTAACAA GGACATCGTC TCCATTTTCT TCAACAATCG 79 8 0 

TGACACCATA GCTACCTGTT TTGTGTGAAA ATCGATATAG CTTCATGCTA TTTTCCTCCC 8 04 0 

TTAAAAG TAT GTTAATATAT AT GT AT CAT A ACATGAATGG AGAATATAAA TGGCTAACTA 810 0 

TCCACAGTTA AACAAAGAAG TACAACAAGG TGAAATCAAA GTGGTTATGC ACACAAATAA 816 0 

AGGTGACATG ACATTCAAA? TATTTCCAAA TATTGCACCA AAAACAGTTG AAAATTTTGT 8 22 0 

GACACATGCA AAAAATGGTT ATTATGATGG AATCACATTC CACCGTGTCA TTAATGACTT 82 8 0 

CATGATTCAA GGTGGCGATC CAACAGCTAC TGGTATGGGT GGCGAAAGTA TTTATGGCGG 834 0 

TGCTTTTGAA GATGAATTTT CATTAAATGC ATTTAACTTA TATGGCGCAT TATCAATGGC 84 0 0 

TAACTCAGGA CCTAATACTA ATGGTTCACA ATTTTTCATT GTTCAAATGA AAGAAGTACC B4 6 0 

TCAAAATATG TTAAGTCAAC TTGCAGATGG TGGCTGGCCT CAACCAATCG TTGATGCATA B52 0 

TGGCSAAAAG GGTGGTACAC CATGGTTAGA TCAAAAACAT ACAGTATTCG GTCAAATCAT 8 58 0 

TGATGGTGAA aCTACATTAG AAGATATTGC AAATACAAAA GTGGGACCAC AAG AT AAA C C B64 0 

ACTTCATGAT GTTGTAATTG AATCTATTGA TGTTGAAGAA TAATATCTAA ACATAATTAA 8 70 0 

CTACCAACAT TTTAAACTCG GATAAAGCTA ATTTATGAAT GGATTAGTAT ATATTCCAAC 8 7S0 

gAAAATAAAT AAACTAATAT GATGAGCAAT CTCAATATAT TTATCaAGAA AGCACAGTTT 8 82 0 

TTAAATAGAT GTGTATTTTA AAGATAATAG TTGAGGTTGC TTTTTATGTT TTTACAGAGA 9 88 0 

ATTGCTATTC AAATAGTAAA TAAATTGAAA ACAAAGTAGC TGGATATCAT ATTGATTTAG 9 94 0 

ATAGGAATTT GTTGCTAATT TTATTTGTAA ATCCAAGTTT G T AG AATTCT TATTCATTTA 900 0 

TAAAATAATA TTCGTATGAT TTGATTTTTT AATTAGTCCA CCATTTCGAT TTGTGCTATG 9 06 0 
AACATATCAA GGTGCGTGTA CTGGTATTCA ACCATACGGT GCGTTTGTTG AGACCCCTAA 9180

TCATACTGAA GGACTGATTC ATATATCAGA AATTATGGAT GACTACGTTC ATAATTTGAA 924 0 

GAAATTTCTA TCAGAAGGCC AAATTGTTAA AGCTAAAATT TTGTCTATAG ATGATGAAGG 9300 

AAAGCTTAAT CTATCATTAA AGGATAATGA TTACTTCAAA AATTATGAGC GTAAGAAGGA 9360 

AAAACAATCA GTATTAGATG AAATCAGAGA AACAGAAAAA TATGGGTTTC AAACACTTAA 94 2 0 

AGAACGCTTA CCAATCTGGA TAAAACAGTC AAAGCGAGCA ATT CG AAA CG ACTAAAGGAA 94 80 

CAGATAAATC GTACCGAAAA TCATACAAAG GGTCTGAAAT GAAAGTTTCT TAGACTATAA 954 0 

AAGAGATTAG TATCTATTAA ATTTTATTAG ATACTAATCT CTTTTTGTCT ACGATAACGT 9600 

AATATGaTTG ATTCTATTTA CACGTACAAA TGGTTTAAGG TGACATATCC ATTATCTTTG 96 6 0 

TTAGATAGAA TCGTTGATTT GCaATATTGT ATGTGGATTT GTTTTTTTTA TTTATTTTAG 972 0 

AAATGAGAAC TACAACTTAA AGTATTAAAC GAATTGCAAC TATATAAACA GATAATTGGA 978 0 

GAATGAAAAA ATTACATGTT ATAGTCAACT CAATAATTTT AAGGAGGAAT TAAGTAATGA 984 0 

AAAGTAAATA CGAACCATTG TTTGATAAAG TAGAATTACC AAATGGAGTA GAGTTGAGAA 9 900 

ATCGATTTGT GTTAGCCCCT TTAACACATA TTTCTTCAAA TGATGATGGT ACTATTTCAG 996 0 

ATGTAGAACT TCCTTATATT GAAAAGCGTT CACAAGATGT TGGTATTACA ATTAATGCTG 1002 0 

CGAGTAATGT GAGTGATGTC GG AAAAG CAT TTCCAGGACA GCCATCAATC GCGCATGACA 10080 

GTAATATTGA AGGACTAAAA CGATTAGCTA CAGCAATGAA GAAAAACGGT GCCAAAGCAC 1014 0 

TCGTACAAAT ACATCATGGC GGTGCACAAG CATTGCCTGA ATTAACACCT GATGGAGACG 10200 

TCGTAGCACC AAGTCCAATT TCTTTAAAAA GTTTTGGTCA GAAACAAGAA CATAGTGCTA 10260 

GAGAAATGAC GAATGAAGAG ATTGAACAAG CAATCAAGGA TTTTGGTGAA GCAACGCGAC 10320 

GTGCAATTGA AGCAGGGTTT GATGGTGTTG AAATACATGG CGCGAATCAT T A CTT AATTC 103 80 

ATCAATTTGT ATCACCATAC TATAATAGAA GAAATGATGT ATGGGCAAAT CAATATAAAT 10440 

TCCCGGTCGC TGTGATTGAA GAAGTACTTA AAGCGAAAGA AGCGTATGGC AATAAAGACT 10 500 

TTATAGTTGG ATACAGATTA TCTCCAGAGG AAGCGGAGTC TCCAGGAATC ACAATGGAAA 10560 

TTACAGAGGA ACTCGTTAAT AAAATTAGCC ATATGCCAAT CGACTATATT CATGTTTCAA 106 20 

TGATGGATAC GCATGCAACG ACACGTGAAG GTAAATACGC TGGACAAGAA AGACTGCCTT 106 8 0 

TAATTCACAA ATGGATAAAT GGTCGTATGC CACTTATCGG TATTGGTTCA ATTTTCACAG 10 740 

CTGACGAAGC TTTAGATGCA GTTGAAAATG TTGGTGTTGA CTTAGTAGCC ATTGGTAGAG 10800 

AG CT AC TACT GGATTATCAA TTTGTTGAAA AAATTAAAGA TGGACGGGAA GATGAAATTA 10860 
AATTTAATGA AGGGTTTTAT CCATTACCAC GTA 10953

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8155 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 
TTTGATAnAA AACTGAATnA ATTAAATGTA TCGATTCAAC CTAATGAAGT GAATTTACAA 6 0 

15 

GTTAAAGTAG AGCCTTTTAG CAnAAAGGTT AAAGTAAATG TTAAACAGAA AGGTAGTTTA 12 0 

GCAGATGATA AAGAGTTAAG TTCGATTGAT TTAGAAGATA AAGAAATTGA AATCTTCGGT 180 

20 AGTCGAGATG ACTTACAAAA TATAAGCGAA GTTGATGCAG AAGTAGATTT AGATGGTATT 24 0 

TCAGAATCAA CTGAAAAGAC TGTAAAAATC AATTTwCCAG AACATGTCAC TAAAGCACAA 3 00 

CCAAGTGAAA CGmAGGCTTA TATAAATGTA AAATAAATAG CTAAATTAAA GGAGAGTAAA 3 6 0 

25 CAATGGGAAA ATATTTTGGT ACAGACGGAg TAAGAGGTGT CGCAAACCAA GAACTAACAC 42 0 

CTGAATTGGC ATTTAAATTA GGAAGATACG GTGGCTATGT TCTAGCaCAT AATAAAGGTG 4 80 

AAAAA CACCC ACGTGTACTT GTAGGTCGCG ATACTAGAGT TTCAGGTGAA ATGTTAGAAT 54 0 

in 

CAGCATTAAT AGCTGGTTTG ATTTCAATTG GTGCAGAAGT GATGCGATTA GGTATTATTT 6 00 

CAACACCAGG TGTTGCATAT TTAACACGCG ATATGGGTGC AGAGTTAGGT GTAATGATTT 66 0 

CAGCCTCTCA TAATCCAGTT GCAGATAATG GTATTAAATT CTTTGGATCA GATGGTTTTA 72 0 

35 

AACTATCAGA TGAACAAGAA AATGAAATTG AAGCATTATT GGATCAAGAA AACCCAGAAT 78 0 

TACCAAGACC AGTTGGCAAT GATATTGTAC ATTATTCAGA TTACTTTGAA GGGGCACAAA 84 0 

AATATTTGAG CTATTTAAAA TCAACAGTAG ATGTTAACTT TGAAGGTTTG AAAATTGCTT 90 0 

40 

TAGATGGTGC AAATGGTTCA ACATCATCAC TAGCGCCATT CTTATTTGGT GACTTAGAAG 96 0 

CAGATACTGA AACAATTGGA TGTAGTCCTG ATGGATATAA TATCAATGAG AAATGTGGCT 102 0 

45 CTACACATCC TGAAAAATTA GCTGAAAAAG TAGTTGAAAC TGAAAGTGAT TTTGGGTTAG 10B0 

CATTTGACGG CGATGGAGAC AGAATCATAG CAGTAGATGA GAATGGTCAA ATCGTTGACG 114 0 

GTGACCAAAT TATGTTTATT ATTGGT CAAG AAATGCATAA AAATCAAGAA TTGAATAATG 120 0 

50 ACATGATTGT TTCTACTGTT ATGAGTAATT TAGGTTTTTA CAAAGCGCTT GAACAAGAAG 126 0 

GAATTAAATC TAATAAAACT AAAGTTGGCG ACAGATATGT AGTAGAAGAA ATGCGTCGCG 1320 
CTGGTGATGG TTTATTAAC7 GGTATTCAAT TAGCTTCTGT AATAAAAATG ACTGGTAAAT 1440

CACTAAGTGA ATTAGCTGGA CAAATGAAAA AATATCCACA ATCATTAATT AACGTACGCG 150 0 

5 TAACAGATAA ATATCGTGTT GAAGAAAATG TTGACGTTAA AGAAGTTATG ACTAAAGTAG 156 0 

AAGTAGAAAT GAATGGAGAA GGTCGAATTT TAG T AAG AC C TTCTGGAACA aACCATTAGT 162 0 

TCGTGTCATG GTTGAAGCAG CAACTGATGA AGATGCTGAA aGATTTGCAC AACAAATAGC 16 SO 

w 

TGATGTGGTT CAAGATAAAA TGGGATTAGA TAAATAAATA CTGTATTACA AATGAGCCGA 174 0 

TGCGTATGcA nTcgtTTTTT GTGTTTGTAG AAATAATTTA TAGTACAAAC GTAAAATGAT 1800 

ATAAACAAAA TAAAAACAAA GTAATCAATA TGTAATATAA AATA CACTGG TACTCAATAT 1860 

fS 

ATAATGATGA TAAAATTAAT TTTAATTAGA TAGAGTTGCT TTGTGTTTTT AACGCAGATG 192 0 

CTACTACTTA TCTTAACAGT TGATTAAGTG AAATCATTTA ACAGCGAGAA TAATCAACCA 1980 

2Q GGAGGATGAC TTAATGAATT TATTCAGACA ACAAAAATTT AGTATCAGAA AATTTAATGT 204 0 

CGGTATTTTT TCAGCTTTAA TTGCCACTGT TACTTTTATA TCTACTAACC CGACAACAGC 2100 

GTCTGCAGCA GAGCAAAATC AGCCTGCACA AAATCAACCA GCACAACCAG CTGATGCCAA 2160 

25 TACACAGCCT AACGCAAATG CTGGTGCTCA AGCTAATCCT ACAGCACAGC CAGCTGCACC 222 0 

TGCCAACCAA GGACAACCAG CAGTACAACC AGCAAACCAA GGTGGACAGG CTAATCCAGC 22 8 0 

AGGAGGAGCA GCACAACCAA ATACACAACC AGCTGGACAA GGTGATCAAG CTGATCCGAA 2340 

30 TAACGCTGCA CAAGCACAAC CTGGAAATCA AGCAACACCG GCAAACCAAG CAGGTCAAGG 24 00 

AAATAACCAA GCAACACCTA ATAATAATGC AACACCGGCA AATCAAACAC AGCCAGCGAA 2460 

TGCTCCAGCA GCAGCGCAAC CAGCAGCACC TGTAGCAGCA AACGCACAAA CTCAAGATCC 2 520 

35 

AAATGCTAGC AATACTGGTG AAGGCAG T AT TAATACGACA TTAACATTTG ATGATCCTGC 2 58 0 

CATATCAACA GATGAGAATA GACAGGATCC AACTGTAACT GTTACAGATA AAGTAAATGG 264 0 

TTATTCATTA ATTAACAACG GTAAGATTGG TTTCGTTAAC TCAGAATTAA GACGAAGCGA 2 70 0 

40 

TATGTTTGAT AAGAATAACC CTCAAAACTA TCAAGCTAAA GGAAACGTGG CTGCATTAGG 276 0 

TCGTGTGAAT GCAAATGATT CTACAGATCA TGGTAACTTT AACGGTATTT CAAAAACTGT 2 82 0 

^ 5 AAATGTAAAA CCAGATTCAG AATTAATTAT TAACTTTACT ACTATGCAAA CGAATAGTAA 288 0 

GCAAGGTGCA ACAAATTTAG TTATTAAAGA TGCTAAGAAA AATACTGAAT TAGCAACTGT 2 94 0 

AAATGTTGCT AAGACTGGTA CTGCACATTT ATTTAAAGTA CCAACTGATG CTGATCGTTT 3 00 0 

50 AGATTTACAA TTTATTCCTG ACAATACAGC AGTTGCTGAT GCTTCAAGAA TTACAACAAA 3 06 0 

TAAAGATGGT TAT AAA TACT ATT CATT CAT TGATAATGTA GGTCTATTCT CAGGATCACA 312 0 
TAATACTSAA ATCGGTAACA ATGGTAATTT TGGTGCTTCA TTAAAAGCAG ATCAATTTAA 3240

ATATGAAGTA ACATTACCAC AAGGTGTAAC TTACGTTAAT AATTCATTAA CTACAACATT 3 3 00 

s CCCTAATGGT AATGAAGACA GTACAGTATT GAAAAATATG ACTGTTAATT ATGATCAAAA 3 36 0 

TGCAAATAAA GTTACATTTA CAAGCCAAGG TGTGACAACG GCACGTGGTA CACACACTAA 342 0 

AGAAGTTTTA TTCCCAGATA AATCTTTAAA ATTATCATAT AAAGTTAATG TTGCGAATAT 34 8 0 

W CGATACACCT AAAAATATTG ATTTTAATGA AAAATTAACA TATCGTACTG CTTCAGATGT 3 54 0 

TGTAATTAAT AATGCGCAAC CAGAAGTaCA CTAACTGCAG ATCCATTTTC AGTAGCGGTT 360 0 

GAAATGAACA AAGATGCGTT GCAACAACAA GTAAACTCAC AAGTTGATAA TAGTCATTAC 3 660 

ACAACAGCAT CAATTGCAGA ATACAATAAA CTTAAACAAC AAGCAGATAC TATTTTAAAT 3720 

GAAGATGCGA ATCATGTTAA AACTGCAAAT CGTGCATCTC AAGCGGATAT TGATGGTTTA 37B0 

GTAACTAAAT TACAAGCTGC ATTAATTGAT AATCAAGCAG CAATTGCTGA ATTAGATACT 384 0 

AAAGCTCAAG AAAAGGTTAC AGCAGCACAA CAAAGTAAAA AAGTTACGCA AGATGAAGTT 3 900 

GCAGCACTTG TAACTAAAAT TAACAATGAT AAAAATAATG CAATCGCAGA AATTAATAAA 3 960 

2S CAAACTACAG CACAAGGTGT CACAACTGAA AAAGATAATG GTATCGCAGT GTTAGAACAA 4 02 0 

GATGTGATTA CACCAACAGT TAAACCTCAA GCGAAACAAG ATATTATCCA AGCAGTTACA 4 0 80 

ACTCGTAAAC AACAAATTAA AAAGTCAAAT GCATCATTAC AAGATGAAAA AGATGTAGCA 414 0 

30 AATGATAAAA TTGGTAAAAT TGAAACAAAG GCAATTAAAG ATATTGATGC AGCAACAACA 4 20 0 

AATGCACAAG TAGAAGCCAT TAAAACAAAA GCAATCAATG ATATTAATCA AACTACACCT 4 260 

GCTACAACAG CTAAAGCAGC AGCTCTTGAA GAATTTGACG AAGTTGTTCA AGCACAAATT 4 3 20 

GATCAAGCAC CTTTAAATCC TGATACAACA AATGAAGAAG TAG CGG AAgC TATTGAACGT 4 3 80 

ATTAATGCAG CTAAAGTTTC TGGTGTTAAA GCAATTGAAG CGACAACGAC TGCACAAGAT 444 0 

TTAGAAAGAG TTAAAAACGA AG AAAT CTCA AAAATTGAAA ATATTACTGA CTCTACGCAA 4 5 00 

ACAAAAATGG ATGCCTATAA TGAAGTTAAA CAAGCTGCAA CAGCTAGAAA AGCTCAAAAT 4 56 0 

GCTACAGTTT CAAATGCAAC AAATGAAGAA GTAGCAGAAG CTGATGCAGC AGTAGATGCA 4 62 0 

^ GCTCAAAAGC AAGGTTTACA TG A CAT CC AA GTTGTTAAAT CAAAACAGGA AGTTGCTGAT 4 680 

ACAAAATCAA AAG T ATT AG A TAAAATCAAT GCAATTCAAA CACAAGCAAA AGTTAAACCT 4 74 0 

GCAGCTGATA CGGAAGTAGA AAA CG CAT AT AATACACGTA AACAAGAAAT TCAAAATAGC 4 800 

50 AATGCTTCAA CTACAGAAGA AAAACAAGCT GCATATACAG AATT AG A T A C TAAAAAGCAA 4 86 0 

GAAGCAAGAA CAAATCTTGA TGCTGCAAAT ACAAACAGTG ATGTAACAAC AG CTAAAGA C 4 92 0 
GCGGAAATCG CTCAAAAAGC AAGTGAACGT AAAACAGCAA TTGAAGCAAT GAATGATTCG 5040

ACTACTGAAG AACAACAAGC AGCGAAAGAC AAAGTGGATC AAGCAGTAGT TACTGCAAAC 5100 

GCTGATATAG ATAATGCTGC AGCAAACAAT GATGTGGATA ATGCAAAAAC TACAAATGAA 5160 

GCTACAATCG CAGCCATTAC ACCTGATGCA AATGTTAAAC CAGCAGCAAA ACAAGCAATT 522 0 

GCAGATAAAG TACAAGCTCA AG AAACAG CA ATTGATGGAA ATAACGGCTC AACAACTGAA 5 28 0 

GAAAAAGCAG CTGCTAAACA ACAAGTTCAA ACTGAAAAAA CAACAGCTGA TGCCGCAATA 534 0 

GATGCAGCAC ATACAAATGC GGAAGTTGAA GCGGCTAAAA AAGCAGCAAT TGCTAAAATT 5400 

GAAGCGATTC AGCCAGCAAC AA CAA CT AAA GATAATGCGA AAGAAGCAAT TGCTACGAAA 54 60 

GCGAATGAAC GTAAAACAGC AATCGCTCAA ACGCAAGACA TTACTGCTGA AGAAATTGCA 5 52 0 

GCGGCTAATG CGGACGTAGA TAATGCTGTG ACACAAGCAA ATAGCAACAT TGAAGCTGCT 5 58 0 

AATAGTCAAA ATGATGTAGA CCAAGCGAAA ACGACAGGTG AAAATAG TAT TGATCAAGTA 56 4 0 

ACACCAACAG TTAATAAAAA AGCAACTGCA CGTAATGAAA TCACAGCAAT TTTAAATAAC 57 00 

AAATTGCAAG AGATTCAAGc tACGCCAGAT GCAACAGATG AAGAAAAACA AGCAGCTGAT 5 760 

GCTGAAGCAA ATACTGAAAA TGGTAAAGCA AATCAAGCCA TTTCAGCAGC AA CT ACT AAC 5 8 20 

GCACAAGTTG ATGAAGCTAA AGCAAATGCA GAAGCAGCGA 7TAATGCGGT AACACCAAAA 58 8 0 

GTTGTGAAGA AACAAGCGGC TAAAGATGAA ATTGATCAAT TACAAG CAAC GCAAACAAAT 5 94 0 

GTT AT CAAT A ATGATCAGAA CGCTACAACA GAAGAAAAAG AAGCAGCTAT TCAACAATTA 6 000 

GCAACAGCAG TTACAGA CGC GAAAAATAAT ATTACAGCTG CAACTGATGA TAATGGTGTA 6 060 

GATCAGGCGA AAGACGCTGG AAAGAATTCA ATTCAAAGCA CGCAACCAGC AACAGCGGTT 6120 

AAATCAAATG CTAAAAATGA TGTTGATCAA GCTGTGACAA CTCAAAATCA AGCAATTGAT 6180 

AATAGAACTG GTGCTACAAC TGAAGAGAAA AATGCAGCAA AAGATTTAGT TTTAAAAGCT 6 24 0 

AAAGAAAAAG CGTATCAAGA TATCTTAAAT GCACAAACAA CTAATGATGT TACGCAAATT 63 00 

AAAGATCAAG CAGTTGCTGA TATTCAAGGT ATTACTGCAG ATA CAA CAAT TAAAGATGTT 6 360 

GCGAAAGATG AATTAG CAAC AAAAGCAAAC GAAGAAAAAG CGCTTATTGC ACAAACTGCA 6420 

GATGCGACTA CTGAAGAAAA AGAACAAGCA AATCAACAAG TAGACGCACA ATTAACACAA 64 8 0 

GGTAATCAAA ATATTGAAAA TGCACAGTCA ATCGATGATG TAAACACTGC AAAAGATAAT 6 54 0 

GCAATTCAAG CAATTGACCC AATTCAAG CA TCAACAGATG TTAAAACGAA TGCAAGAGCG 6 60 0 

GAATTGCTAA CTGAAATGCA AAATAAAATA ACTGAAATAC TTAATAATAA TGAGACTACT 6 66 0 

AATGAAGAAA AAGGTAACGA TATTGGACCA GTTAGAGCAG CATATGAAGA AGGTTTAAAT 6 72 0 

AAAGTTCAAC AACTTCATGC AAATCCTGTT AAGAAACCAG CAGGTAAAAA AGAATTAGAT 6840

CAAGCTGCAG CTGATAAGAA AACACAAATA GAACAAACAC CAAATGCATC ACAACAAGAA 6900

ATTAAT3ATG CAAAACAAGA AGTTGATACT GAATTAAATC AAGCGAAAAC AAATGTCGAT 6960

CAATCATCAA CAAATGAATA TGTTGATAAT GCAGTTAAAG AAGGAAAAGC TAAAATTAAT 7020

GCA3TTAAAA CATTTAGTGA GTACAAAAAA GATGCTTTAG CTAAAATTGA AGATGCATAT 7080

AATGCTAAAG TAAACGAAGC GGATAACTCT AACGCATCGA CTTCAAGTGA AATTGCTGAA 7140

GCGAAACAAA AACTTGCTGA ATTAAAACAA ACTGCGGATC AAAATGTTAA TCAAGCTACT 7200

TCTAAAGATG ACATTGAAGT TCAAATTCAT AATGACTTAG ATAATATTAA CGATTACACA 7260

ATTCCAACAG GTAAAAAAGA ATCAGCTACA ACAGATTTAT ATGCTTATGC AGATCAGAAG 7320

AAAAATAATA TTTCAGCTGA CACTAATGCA ACACAAGATG AAAAGCAACA AGCAATTAAG 7380

CAAGTTGACC AAAATGTTCA AACTGCATTA GAAAGCATTA ATAATGGTGT GGATAATGGT 7440

GACGTTGATG ATGCATTAAC ACAAGGTAAA GCAGCAATTG ATGCTATTCA AGTAGATGCT 7500

ACTGTTAAAC CTAAAGCGAA CCAAGCTATT GAAGTTAAAG CAGAAGATAC GAAAGAATCT 7560

ATTGATCAAA GTGACCAGTT AACTGCTGAA GAAAAAACTG AAGCATTAGC AATGATTAAA 7620

CAAATTACAG ATCAAGCTAA ACAAGGTATT ACTGATGCAA CAACAACTGC TGAAGTTGAA 7680

AAAGCGAAAg cTCaAGGACT TGAAGCATTT GATAACATTC AAATCGACTC AACAGAAAAA 7740

CAAAAAGCTA TCGAAGAATT AGAAACTGCA CTAGACCAGA TTGAAGCAGG TGTAAATGTC 7800

AACGCTGATG CTACAACTGA AGAAAAAGAA GCGTTTACGA ATGCTTTAGA AGACATTTTA 7860

TCAAAAGCAA CTGaAGATAT TTCTGATCAA ACTACAAATG CAGAAATCGC TACTGTCAAA 7920

AATAGTGCGC TTGAACAACT TAAAGCACAA CGTATTAATC CTGAAGTTAA GAAAAATGCT 7980

TTGGAAGCAA TCAGAGAAGT GGTTAACAAG CAAATAGGAA CAATTAAAAA TGCAGATGCA 8040

GATGCATCGG CGGAAAGAnA TTGCACGTAC GGGATTTAGG TAGATATTTT GGACCGATTT 8100

GCTGGATAAA TTTAGGGTnA AACCCCAACC AATGCCGAAG TTGCCTGAAT TACCA 8155

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1630 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
CTGTTTTATT TGCAGCACCC ATACTGGAAA TCACTTTAAT CCCTCGGTCA AGACACTCTT 120

TCATTAAGTG TACTTTGTAC ATTATTGTAT CACTTGCATC TACAAAATAA TCTATATCGT 180 

AGTTATCGAA AATTTCTTCA TATGTCTCTT CTGTATAAAA CATATGTAAG GGCGTGACTT 24 0 

TACAATCTGG ATTAATTAAT TTAATACGTT CTTCCATCAA AGAAACTTTA CTTTGTCCTA 300 

CCGTTGTAGT TAAAGCGTGT AATTGTCTGT TTACATTTGT AATATCAACA T CAT CTTT AT 36 0 

CTATTAATAT AATATGACCA ATATTCGTTC TTGCTAATGC TTCAGCAGCA AATGAACCAA 420 

CACCTCCAAC GCCAAGTATG ACAACAGTTT GTTGCTTCAA TAAATCTAAA CCTTGTTGTC 4 80 

CAATCGCTAG TTCATTTCTT GAAAATTGAT GTTTCATTAT TTTACCTCTT TCACTGATTT 54 0 

ATACATAAGT ACATAGTAAC TTAAAATTTT ATATTTAGCA TTATCACTTT GATTATTTTC 600 

CCAAAATTCA ACGAGGAAAC ATTTATTAAA CGCTATAAAA CCCAACTAAT TCTTTATTAA 66 0 

AAACTTAAAG AAACG CAT AA AAATACGCAA GACAAAGTCT TGCGTATCGA TAGAGTCCGT 72 0 

ATTGCCGTAG TTATAATAGC TTGATCATTC GGCCTGTTAT AT ACAGG TGG GTGCCCTGTT 78 0 

TCTTGTTTTG TACGTCCTTC ATATAAGGCG TGTACGCTGC AAGAAAACCC ATTGGGCTCC 84 0 

CTTGATCAAA GAGTGTTAGG CCCAAATTAA AAAGCAAACT TACGAACAAC TCAGATGACT 900 

ATCTTATGAT GTTATATTAC CACATAATTA AAATTAATGA AATTATAACA AACCAAAGTT 96 0 

TATTGATTTT TTAAAATTTA GTGACGAATT CGCAAAGAAA GTTCTTCTAA TTGTTTATCA 102 0 

GAAACTTCAC TAGGCGCATT CGTTAATAAA CATGTAGCAG ATGCTGTTTT AGGGAATGCG 1080 

ATTGTATCTC TCAAGTTTGT TCTATTAGTC AATAACATGA CTAATCGGTC tAATCCTAAT 114 0 

GCAATACCGC CATGTGGTGG TGCACCATAT TTAAATGCAT CTAGTaAGAA GCCGAACTGT 12 0 0 

TCCTgTGCTT GTTCTTTAGT AAATC CAAGA ACTTCGAACA TTTTTTCTTG TAACTCACCA 1260 

TCATGAATTC TGATTGAACC GCCACCTAAT TCATAACCAT TTAATACTAT GTCATAAGCA 13 2 0 

TTTGCCTCAG CTTCtTCTGG CGCAGTGCCA AGCTTAGCAA TATCAGCTTC TTTTGGAGAT 13 80 

GTAAATGGAT GATGTGCTGC AACGTAACGT TTCGCATCTT CAT CAT ATT C TAATAATGGC 144 0 

CAATCTGTCA CCCATAAGAA GTTTAATTTT GTTTCATCGA TTAAACCTAA TTCTTTAGCT 15 00 

AATTTGACAC GTAATGCACC TAAACTTTGT GCAACGACAT TTGGTttGTC TGCAACAAAC 1560 

ATTACTAAGT CACCAGCTTC AGCACCAGTT AATGTAAGTA ATGTTTCAAC ATTTTCTGTT 162 0 

cAAAGAAACG 1630

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 732 base pairs 
(B) TYPE: nucleic acid






EP 0 786 519 A2



(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

CAATTGGACA TCTTGTA7GA AAAGGACAAC CTTGCGGCGG ATTACTTGGC GAAGGTAATT 60 

10 CTCCTTTTAA TATAATTCTA TTGTTATTAT GTTTATCAAT TTGTGGTATT GATGAAATCA 12 0 

ACGCTTTTGT ATATGGATGT TTGGGATTTT CATAAATTTC TTTATCAGAT GCGATTTCAA 180 

CTATATGACC TAAATACATA ACTCCAATGA CATCACTTAT ATGTTTTACT ACACTTAAAT 24 0 

15 CATGTGCGAT AAATAAATAG CTTAAGTTAA ATTGTTCTTG TAAATCTTTT AATAAATTCA 3 00 

GTACTTGAGA TTGAACAGAT ACATCTAATG CACTTACAGG CTCATCAGCA ACAATTAAAC 3 60 

TCGGACGCAA AGCCAATGCT CTTGCAATTC CCACTCTTTG TCTCTGTCCA CCTGAAAATT 420 

20 

CATGTGCATA TTtATAATAT GCATCTTCAC TTAGGCCAAC ACATTTTAAT AAATATAGTA 4 80 

CTTCTTTTTT TATTTCTTCT TTTGGCAATT TTTTATAATT TAAAATAGGT TCTGAAATGA 54 0 

TATCTCCAAC CATTTGCATC GGATTCAATG ATGCATACGG ATCTTGAAAT ATCATCTGAT 6 00 

2S 

ATTGTTGTCG TGATTTTCTG AGTTTTTTAC CTTGTAATCT TGTTATATCT TCACCATTAA 66 0 

CAATTATTGA GCCTGAAGTT GCATCTTCAA GCCTGATAAT CACTTTACCT AACGTTGACT 72 0 

TACCACAACC CG 732

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5838 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
AATATATTCA TArGTTTCAT CAACAATATT AGCTGCTTTT TGAATTAAAG CAATTTCGTC 6 0 

AGCATCTTTG ACGTCTCTAA TTTTATCTAC AGTATTAGAA ATGCTTATTA ATGATATACG 12 0 

45 

GCTTTTATTT AATTCAAGGT ATGTATCATA ACTTACATGA TGCCCCTCAA AACCTACATT 180 
TTCAAAATTT TCTTGGTGTA GCAATTCTTT AATCTCACCA ATAATAGTAG ATTTACGATT 24 0 

50 AATAATTTCA TAATTTGGCG CCTGCTTAGT TGCTTGATCA ATATATCTAA AGTCTGTTAT 30 0 

CAAATATTGT TTATCTTTAG ATATGATAAG TGCTCCACTG GTACCAGTAA AACCTGATAA 360 
ATATCTTCTA TTGTAATCCG AAAGAATGaT AATCGCATCT AAATGTTTTT GTTCTAAAAT 42 0 
CAACTTTATA CATTAAAATA ATATCATAAT AAGGATAAAA AATAATAGAT ATTGATTTTA 540

GGGAGATAGT AATGAAAAAA TTGGTTTCAA TTGTTGGCGC AACATTATTG TTAGCTGGAT 6 00 

GTGGATCACA AAATTTAGCA CCATTAGAAG AnAAAACAAC AGATTTAAGA GAAGATAATC 660 

ATCAACTCAA ACTAGATATT CAAGAACTTA ATCAACAAAT TAGTGATTCT AAATCTAAAA 720 

TTAAAGGGCT TGAAAAGGAT AAAGAAAACA GTAAAAAAAC TGCATCTAAT AATACGAAAA 7 80 

TTAAATTGAT GAATGTTACA TCAACATACT ACGACAAAGT TGCTAAAGCT TTGAAATCCT 84 0 

ATAACGATAT TGAGAAAGAT GTAAGTAAAA ACAAAGGCGA TAAGAATGTT CAAT CG AAA T 900 

TAAATCAAAT TTCTAATGAT ATTCAAAGTG CTCACACTTC ATACAAAGAT GCTATCGATG 96 0 

GTTTATCACT TAGTGATGAT GATAAAAAAA CGTCTAAAAA TATCGATAAA TTAAACTCTG 102 0 

ATTTGAATCA TGCATTTGAT GATATTAAAA ATGGCTATCA AAATAAAGAT AAAAAACAAC 108 0 

TTACAAAAGG ACAACAAGCG TTGTCAAAAT TAAACTTAAA TGCAAAATCA TGATAGGAGT 114 0 

CTTTTAATGC GTAATATAAT ATTTTATCTT GTACTTATTA TTGCTGCGAT TGGATTAGTA 1200 

ATGAATCTAG ATGCCTTTAT TTTTTCAATC GTCAGAATGT TAATCAGCTT TGcgTAaTAG 1260 

CTGGTATTAT TTATC7GATT TATTATTTCT TCATCTTAAC TGAAGACCAA CGCAAATATC 13 20 

GCAAAGCAAT GCgTrAaGTA TAAAAGAAAT CAAAGAAGAA AATAGATAAA AAAACGGAAG 13 80 

CACTTGTAGG TAAAATAGTC TACGTGCTTC CATTTTTTAT TCTAAAAACT ACTTTCTAAA 14 4 0 

CATCCATTCA TCTGAACGAT ATTTTTCAGT TAATTCTTCC ACTTCTGCCA ATTG AG CTTC 15 00 

TGtTAATTCA AGTGGCTTTA ATTCTATATT TAAACCTTTC TTAAAACCTT TCTCGAAAGC 156 0 

TTCTTCCATT TGACTAATAG TAATGTGTTC ATCTGAAATA TCATTGATGG CAACTGCTTT 16 20 

TTCAACGAAT GCCTCTTTCA TTTTTAATTT TAATCTTTCA TTTTTATAAA TrAACATATC 16 80 

AAACAGTTCA TCAATATCAA TATCTTGTAA AATCGAACCG TGTTGGAGGA TTACGCCCTT 174 0 

TTGTCTCGTT TGAGCACTCC CAGCAATCTT ACGGCCTTCA ACAACTAGCT CATACCAACT 18 00 

TGGTGCATCA AAACACACTG AACTTCGAGG TTGTTTTAAT TTTTGACGCT CTTCAGGCGT 18 60 

TTTAGGTACC GCAAAATAAG TATCAAATCC TAAGTTTTTA AATC CTTCT A ATAATCCTTG 1920 

45 TGAAATCACT CTGTACGCTT CTGTAACTGT AGAAGGCATA TTCGGATGCG ATTCAGGCAC 19 80 

AATCACACTG TAAGTTAACT CTTTATCATG TAGCACCCCA CGGCCACCAG TTTGACGCCT 2 04 0 

TACGAGACCA AAACCTTTCT CTTTAAC CTT ATCAATATCA ATTTCTTTTT GTAGCCTTTG 2 10C 

GAAATACCCT ATTGATAATG TTGCAGGATT CCATGTGTAA AAACGTATAA CTGGATCAAT 2160 

TTCACCTCTA GAGACAAAAT TTAATAACGC TTCATCCATT GCCATATTAT AATATGGGTC 2 22 0 
AAATGTATAA TATTTGATTC GCTAATTAAT CAATTTAACT AAATGAATAA TAATTGCAAT 2340

TCTTTAGTGA AATATTTTGA TAATTTGACC TAACAGTCTT ATAATTATAT TATCGTTTAA 24 CO 

5 TTAGGGAGGA TGCAAGATGA GTGCTAGTTT GTACATCGCA ATAATTTTAG TTATAGCAAT 2460 

TATTGCTTAT ATGATTGTTC AACAAATTCT TAACAAGCGA GCTGTTAAAG AATTAGATCA 2 52 0 

AAATGAATTC CATAATGGGA TTAGAAAAGC TCAAGTCATC GATGTTAGAG AGAAAG TTG A 2 580 

w 

CTATGACTAC GGTCACATTA ATGGGTCTCG CAATATTCCT ATGACAATGT TCAGGCAACG 264 0 

ATTCCAAGGA TTAAGAAAAG ATCAACCGGT ATACTTATGT GATGCCAATG GGATTGCTAG 270 0 

js CTATAGAGCC GCTCGTATTT TGAAAAAGAA TGGATATACA GATATCTATA TGTTAAAAGG 276 0 

CGGCTATAAA AAATGGACTG GAAAAATAAA GTCTAAAAAA TAGTTTTTGT AAATTTAATA 282 0 

TACGATTTAA TAAAATCTGA GTGTTAATTG ATCATCAATA ACAATACTCA GATTTTAATT 28 80 

20 TTTTAACAAA GTCTGTTACT ATATTTCTCT AGCTTCACTG ATCATTAAAC TTAGTTTCAG 294 0 

CATAATAAAG AAAGTTCAGC TCATTTTCAA TACGATTCAA TTACCGCAAT CTAAAAAATG 3000 

AAAAGACAAT TTCTATGAAA GAATAATACC AAACCCTAAG AGTTATTACT TCGGTTTAGT 3060 

25 TTTCTTGTTT AAATAGAAAT TGTCTTTTTC AATTGATTTT GAAACCATTA TCCTTAAATC 3120 

TTCATACAAA GTTAGAATAA TAATTCTCGG AATATGTGTT TAATACTTTA TTTTTCCTGT 3180 

TTAAGATTTT CAAACTTTAA TATTGGTTTA CGAGCAGCTG TAGCTTCGTC TAATCGATCA 3 24 0 

30 

ATCACAGTTG TATGTGGTGC TTCTAGCacT TTATCAGGAT CATTTTTAGC TTCTTCAGCA 33 00 

ATACTAATTA ATGTATCGAT AAAATAATCA AGTGTTTCTT TAGACTCTGT CTCAGTCGGT 33 6 0 

TCAATCATCA TACCTTCTTC AACATTTAAT GGG AAGTATA TTGTTGGTGG ATGTACACCG 34 20 

35 

AAATCTAATA ATCGCTTAGC CATGTCTAAA GTACGTACAC CAAATTCTTT TTGACGCACA 34 8 0 

CCACTTAACA CAAACTCGTG TTTACAATAT TGTTTATAAG GTATTTCAAA GTGTTTAGAT 3 54 0 

40 AAACGTGCTT TAATATAATT CGCATTAAGA ACCGCTGCTT CAGAAACCTC TTTAAGTCCA 36 00 

GTTGCTCCCA TAGTTCGAAT ATACGTATAA GCTCTTAAGT AAATACCAAA GTTACCATAA 3 660 

AATGGTTTTA CACGTCCGAT AGAATTTTTA ATGTCATTAT CATATTTAAA TTTGTCGCCA 3720 

45 TCTTTAATAA CCATTGGCTT TGGTAAGTAA CTTGCTAGTT CTTTTACTAC AC CGACTGGA 37 80 

CCTGAACCAG GACCGCCACC ACCATGTGGA CCAGTAAATG TTTTATG CAA GTTTAAATGA 3 840 

ACAGCATCAA ATCCCATATC TCCTGGGCGA ACTTTGTCCA TAATAGCGTT TAAATTCGCA 3 9 00 

50 

CCATCATAAT ATAATAGACC ACCAGCATTA TGGACGATTT CACGGATTTC CATAATATTT 3 96 0 

TTTTCGAAAA TACCTAAAGT GTTTGGATTA GTTAACATAA TAGCTGCTGT ATTTTCATTT 4 02 0 




GATTTAAATC CTGCAAATGa AGCTGAGGCT GGaTTCGTAC CATGCGCAGA ATCTGGcACA 414 0 

ATGACTTCAT CACGATGACC TTCACCATTA TTCTCATGGT AAGCTTTAAA TATCATCAAT 42 00 

S GCAGTCCATT CACCATGTGC GCCAGCAGCT GGTTGTAATG TCACCTCATC CATACCAGTA 4 2 60 

ATTTCTTTTA ATTCTTCTTG CAAACTATAA ATAATTTCTA ATGAACCTTG AACTTGATCT 4 3 20 

TCATCTTGTA ATGGATGTGA TTCACTAAAT CCTGG TATTC TAGCAACCTT TTCATTAATT 4 33 0 

w 

TTAGGGTTAT ACTTCATCGT ACATGAACCC AATGGATAAA ATCCGTTGTC TACACCGAAA 4 44 0 

TTTTTATTTG AAAGTTCAGT ATAATGACGT ACTAAGTCTA GTTCAGCAAC TTCAGGAAAC 4 50 0 

TCCGCTTTGT TTTTACGAAT AAATTTATCA TCTAACAATG ACTCAACAGA ATTTGTTTTA 4 56 0 

IS 

ATATCACTTT TTGGTAATGA ATATGCATAT CTGCCTTCAC GAGATCTTTC AAAAATTAAT 4 62 0 

GGACTTGATT TACTAGTCAT TTAACTCACC AGCCTTTTCT ACAAATGTAT CGATTTCATC 4680 

2Q TTTTGTTCTT AATTCAGTTA CAGCTATTAA CATGTGATTT TTAAAGTCGT CTGAAACAAC 4 74 0 

ACCTAAATCA AAACCACCGA TAATATTGTA CTTCACTAAT TCCTCGTTAA CTTGTTGAAT 4 800 

TGGTTTGTCA AATTTGACTA CAAACTCATT GmnAAGnTGT AC CAT CTAAT ACTTCAAAAC 4 860 

25 CTTTTTTAAT AAATTGTTGT TTAGCATAGT TAGCATGTTC TATATTTTGA ACTGCAATAT 4 92 0 

CATAGATACC TTGTTTACCA AG TG CTGACA TTGCAATTGA TGaCGcTAAA GCATTTAATG 4 98 0 

CTTGGTTAGA ACAAATATTA GATGTCGCTT TATCGCGTCG AATATGTTGT TCACGTGCTT 5 04 0 

GTAATGTTAA TACAAAGCCA CGATTACCTT CATCATCTTG TGTTTGACCG ACTAATCTAC 5100 

CTGGCACTTT ACGCATTAAC TTTTTCGTCG TTGCAAAATA TCCACAATGT GGCCCACCGA 516 0 

ATTGAGCAGG AATTCCGAAT GGCTGAGTAT CACCTACAAC AATATCTGCA CCAAATGAAC 5220 

35 

CTGGAGGTGT AAGTAATCCC AATGCTAATG GATTTGCATA TACGATAAAT AATGCTTTTT 52 80 

TATCTTCAAT AAAGCTATGA ATCTTTTCAA GATCTTCAAT TGAACCGTAA AAGTTTGGAT 5 34 0 

ATTGTACTGC AACAGCTGCT GTTTCATCAT CCACTGCTGC TTCTAATTTT TTCAAATCTG 54 00 

40 

TAACAGTGCC ATCTAAATCG ATTTCCACTA CTTCGAATTC CTTACGCGTC TT AG CAT AAG 54 6 0 

TATGAAGTAC TTGTAATGCT TGATAATGTA AACCTTTTGA GACTACAATT TTATTTTTCT 5 520 

45 TTGTTTGACT AAATGCTAAG ATACATGCTT CAGCAAAGCT AGT CATC CCA T CAT A CAT AG 55 80 

AAGAATTTGC T AC AT C CAT A TCTGTTAATT CACAAATTAA AGTTTGGAAC TCAAAAATGG 5 64 0 

CTTGTAATTC ACCTTGAGAA ATTTCCGGTT GATATGGCGT ATATGCTGTG TAAAATTCTG 5700 

50 ATCTTGAAAT CATAGCATCC ACAACTGATG GCGCGTAATG ATCATAAACA CCAGCACCCA 5 76 0 

rAAATGATGT ATGCGTTTCT TTAGTGATAT tCTTGCTkGC AAPGGGGATT TAAACnTCTA 5 82 0 
(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18355 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

ATnATAATTG GCTTTGCTAA TAATTACTTC CCTGAATTAC aAGTATTAGC AAACGAAATA 60

AAATCTGATA TGGCTAGTTC ATTAAAACAA TGATATTTTT ATTTAAATTT TTaAAGCTTT 120

GTACGAAATT GTACAAAGCT TTTTTGGTGC GTATTGTATG GGCAACAACT TGACGATGAA 180

AATCCGTTAC AGGATTGGTA ATAGGAAATG TTAGCGAAAG ACAAGGGTAT CCATTGTAGA 240

TTAACAAAAG GACGTTTCCA CAAGTGTGGG TTATTCTCAC TAAAGCAATA CGCAGAGACA 300

ACTTACGTAA AATTTTGAAC TGACTAGAAC GGAACTTCTA CTCAATTATT GATAAAAATT 360

TTCAAAAAGA CTTGAATGTG CTGAGAATAC GAAGTTTATG GAAGGATTAT CAAAATATAA 420

ATGTGCATTC ATTTACAACC TTTATTGACA ATGATTCTCA ACTAATATAG TATATAATCA 480

AATCGTAATA GTTACGATTT GTTTTCTGCA ACTTTTTTGA AGTTTTAGTT GAGGTGAAAA 540

CAATAAAAGC ATCTAAGTGA ATGTAGTTAA CGGACAACTG CATTCGCTTG TAGAGCCACA 600

AGAAGCAACT TTAAATAAGG TTTACGGTTG CATTTTGATA CAACAACCGA TTACTAAGTC 660

ATGCTTTCCA CTTTGCGGGT TAGCATGACT TACCTAATAG ATAGAGCTAT TAGGTTCAGC 720

TTCTAAAAAA TTACAGTTTT AGAGGAATAC AGTTGcTTGc tTCGCAACAA CTGCATAAGA 780

GCCATGGTTT TCGCTTTTGC GAATTAGCAT GACTTACCTA CTAGATAGAG CTATTAGGTT 840

CATCTTCTAA AAAATTACAG GTTTAGAGGA ATACAGTTGT TTGcTTCGCA ACAACTGCAT 900

AAGAGCCTCT AGTAATTAAA ATTACAGAGG CTC7AAAAAT ACATCTAAAG GAGTGTCGTA 960

TGAATCGGCA GGTTATAGAA TTTTCTAAGT ATAATCCTTC GGGGAATATG ACGATACTTG 1020

TTCATTCAAA ACATGATGCT AGTGAATATG CATCTATCGC CAATCAGTTG ATGGCCGCAA 1080

CACATGTATG CTGTGAACAG GTAGGCTTTA TAGrATCAAC ACAAAATGAT GATGGTAATG 1140

ATTTTCACTT AGTTATGAGC GGTAATGAAT TTTGCGGTAA TGCGACGATG TCATATATAC 1200

ATCATTTGCA GGAAAGTCAT TTGCTTAAAG ACCAACAGTT TAAGGTGAAG GTGTCTGGCT 1260

GTTCGGATTT AGTGCAATGC GCAATTCATG ATTGCCAATA CTATGAAGTT CAAATGCCAC 1320

AAGCCCATCG TGTTGTGCCA ACAACAATTA ATATGGGTAA TCATTCATGG AAAGCAATAG 1380

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TTCAACATTT GGTTGAAGCG TTTGT3CGTG AgcAACAATG GAGTCACAAA TATAAAACAG 15 00 

TAGGTATGAT GCTTTTTGAT GAACAACGTC AATTTTTACA GCCATTAATC TATATACCAG 156 0 

AAATTCAAAG TTTAATTTGG GAAAATAGCT GTGGTTCTGG TACAgcATCA ATTGGGGTTT 162 0 

TTAATAATTA TCAACGTAAT GACGCATGCA AAGATTTTAC AGTACATCAG CCAGGGGGCA 168 0 

GTATTTTAGT GACATCAAAG CGATGTCATC AATTGGGATA TCAAACTTCA ATTAAAGGAC 174 0 

AGCTTACAAC TGTAGCTACA GGaAAAGCAT ATATAGAATA AGGAGCCTAC AATGAATAAC 1800 

TTTAATAATG AAATCAAATT GATATTACAA CAATATTTAG AAAAGTTTGA AGCGCATTAC 186 0 

GAGCGTGTAT TACAAGACGA TCAATATATC GAAGCATTAG AAACATTGAT GGATGACTAT 1920 

AGTGAATTTA TTTTAAATCC TATTTATGAA CAACAATTTA ATGCTTGGCG TGACGTTGAA 1980 

GAAAAAGCAC AATTaATAAA ATCACTGCAA TATATTACAG CGCAGTGTGT TAAACAAGTG 204 0 

GAAGTCATTA GAGCGAGACG TCTATTAGAC GGACAGGCGT CTACCACAGG TTACTTTGAC 2100 

AATATAGAAC ATTGTATTGA TGAAGAGTTT GGACAATGTA GTATAGCTAG CAATGACAAA 2160 

TTATTGTTAG TTGGTTCAGG TGCATATCCA ATGACGTTAA TTCAAGTAGC AAAAGAAACA 2 2 20 

GGTGCTTCAG TTATCGGTAT TGATATTGAT CCACAAGCCG TTGACCTAGG GCGCAGAATC 2 2 80 

GTTAACGTCT TAGCACCAAA TGAAGATATA A CAATT A CGG ATCAAAAGGT ATCTGAACTT 2 34 0 

AAAGAT AT C A AAGATGTGAC GCATATCATA TTCAGCTCGA CAATTCCTTT AAAGTACAGC 2 4 00 

ATTTTAGAAG AATTATATGA TTTAACAAAT GAAAATGTCG TAGTTGCAAT GCGCTTTGGT 24 6 0 

GATGGCATCA AAGCAATATT TAATTATCCG TCACAAGAAA CAGCGGAAGA TAAGTGGCAA 2 52 0 

TGTGTGAATA AACATATGAG ACCACAGCAA ATTTTTGATA TAGCACTTTA TAAAAAAGCA 2 58 0 

GCTATAAAGG TAGGTATTAC GGATGTCTAA ATTATTAATG ATAGGCACTG GTCCgGTCGC 2 64 0 

AAT CCAATT A GCGAATATTT GCTATTTAAA A T CAGATT AT GAGATTGATA TGGTTGGACG 2 70 0 

TG CCTCAACA TCAGAAAAAT CAAAACGCTT ATATCAAGCG TATAAAAAAG AGAAACAATT 2 76 0 

TGAAGTCAAA ATACAAAACG AGGCGCATCA ACATCTGGAA GGTAAGTTTG AAATTAATCG 2 82 0 

TTTGTATAAA GATGTTAAAA ACGTTAAGGG TGAATACGAA ACGGTTGTCA TGGCATGCAC 2 880 

AGCAGATGCT TATTATGACA CACTACAGCA ATTGTCGTTA GAAACTTTGC AAAGTGTCAA 2 94 0 

ACATGTCATT TTAATATCAC CGACATTTGG TTCGCAAATG ATTGTCGAAC AATTTATGTC 3 000 

TAAATTTAAT AAAGATATCG AAGTGATTTC ATTGTCAACT TATCTTGGCG ATACACGTAT 3 06 0 

TGTTGATAAA GAAGCGCCTA ATCATGTGTT GACAACAGGT GTAAAAAAGA AATTGTACAT 312 0 

GGGATCGACA CATTCAAACT CAACAATGTG TCAACGAATC TCTGCTTTAG CTGAGCAATT 3180

TTATGTGCAC CCACCACTAT TTATGAATGA CTTTTCATTG AAAGCCATTT TCGAAGGAAC 3300

AGATGTACC3 GTTTATGTGT ATAAGTTATT TCCTGAAGGA CCGATAACGA TGACACTAAT 3360

CCGTGAAATG CGTTTAATGT GGAAGGAAAT GATGGTTATT TTACAAGCAT TTAGAGTGCC 3420

GTCAGTCAAC CTGCTTCAAT TTATGGTGAA GGAAAATTAT CCAGTACGTC CTGAAACTTT 3480

GGATGAAGGT GATATTGAGC ATTTCGAAAT CTTGCCAGAT ATCTTACAAG AATATCTGCT 3540

TTATGTAAGA TATACCGCAA TCCTCATTGA TCCATTTTCA CAGCCAGACG AAAACGGACA 3600

TTACTTTGAT TTTTCAGCTG TACCATTTAA GCAAGTCTAT AAAAATGAAC AGGATGTTGT 3660

TCAAATTCCA AGAATGCCAA GTGAAGATTA TTACAGAACG GCGATGATTC AGCATATTGG 3720

GAAAATGCTA GGTATCAAAA CGCCAATGAT TGATCAGTTC CTAACTCGCT ATGAAGCAAG 3780

TTGCCAGGCG TACAAGGATA TGCATCAAGA TCAACACTTA TCTTCTCAAT TTAATACAAA 3840

TCTATTTGAA GGAGATAAAG CACTCGTCAC AAAATTTTTG GAAATCAATA GAACGCTTTC 3900

ATAATAAGGG TTTGAAGTTT TATAATAGAA AAAAATTATT GAATTATGTT TGACATTTAC 3960

ATAAAAATAA GCAAATAATT GAGAAAAATA ATCATTACGA TTTGATTAAG TAATGCAACT 4020

TATCAATTTA GAAAGAGGAA AAGCAAATGA GAAAACTAAC TAAAATGAGT GCAATGTTAC 4080

TTGCATCAGG GCTAATTTTA ACTGGTTGTG GCGGTAATAA AGGTTTAGAG GAGAAAAAAG 4140

AAAACAAGCA ATTAACGTAT ACGACGGTTA AAGATATCGG TGATATGAAT CCGCATGTTT 4200

ACGGTGGATC AATGTCTGCT GAAAGTATGA TATACGAGCC GCTTGTACGT AACACGAAAG 4260

ATGGTATTAA GCCTTTACTA GCTAAAAAGT GGGATGTGTC TGAAGATGGG AAGACATACA 4320

CGTTCCATTT GAGAGATGAC GTTAAATTCC ATGATGGTAC GCCATTTGca TGctGACGCA 4380

GTTAAGAAAA ATATTGACGC AgTTCAAGAA AACAAAAAAT TGCATTCTTG GTTAAAGATT 4440

TCGACATTAA TTGACAATGT TAAAGTTAAA GATAAGTACA CGGTTGAATT GAATTTGAAA 4500

GAAGCATATC AACCTGCATT GGCTGAATTA GCGATGCCTC GTCCATATGT ATTTGTGTCT 4560

CCAAAAGACT TTaAAAACGG TACAAcAAAA GATGGCGTTA AAAAGTTCGA TGGTACTGGT 4620

CCATTTAAAT TAGGTGAACA CAAAAAAGAT GAGTCTGCAG ACTTTAACAA AAATGATCAA 4680

TACTGGGGCG AAAAGTCTAA ACTTAACAAA GTACAAGCAA AAGTAATGCC TGCTGGTGAA 4740

ACAGCATTCC TATCAATGAA AAAAGGTGAA ACGAACTTTG CCTTCACAGA TGATAGAGGT 4800

ACAGATAGCT TAGACAAAGA CTCTTTAAAA CAATTGAAAG ATACAGGTGA CTATCAAGTT 4860

AAGCGTAGTC AACCTATGAA TACGAAAATG TTAGTTGTCA ATTCTGGTAA AAAAGATAAC 4920

GCTGTGAGTG ACAAAACAGT CAGACAAGCG ATTGGTCATA TGGTAAACAG AGATAAAATT 4980

ACAGACATTA ATTTCGATAT GCCAACACGT AAGTATGACC TTAAAAAAGC AGAATCATTA 5100

TTAGATGAAG CTGGTTGGAA GAAAGGTAAA GACAGCGATG TTCGTCAAAA AGATGGTAAA 5160

AACCTTGAAA TGGCAATGTA CTATGACAAA GGTTCTTCAA GTCAAAAAGA ACAAGCAGAA 5220

TACTTACAAG CAGAATTTAA GAAAATGGGT ATTAAGTTAA ACATCAATGG CGAAACATCA 5280

GATAAAATTG C7GAACGTCG TACTTCTGGT GATTATGACT TAATGTTCAA CCAAACTTGG 5340

GGATTATTGT ACGATCCACA AAGTACTATT GCAGCATTTA AAGAGAAAAA TGGTTATGAA 5400

AGTGCAACAT CAGGCATTGA GAACAAAGAT AAAATATACA ACAGCATTGA TGACGCATTT 5460

AAAATCCAAA ACGGTAAAGA GCGTTCAGAC GCTTATAAAA ACATTTTGAA ACAAATTGAT 5520

GATGAAGGTA TCTTTATCCC TATTTCACAC GGTAGTATGA CAGTTGTTGC ACCaAAAGAT 5580

TTAGAAAAAG TATCATTCAC ACAATCACAG TATGAATTAC CATTCAATGA AATGCAGTAT 5640

AAATAAAGGA GCAATTAGAT GTTCAAATTT ATCTTAAAAC GTATTGCGCT CATGTTTCCA 5700

TTGATGATTG TAGTAAGTTT TATGACATTT CTATTGACGT ATATTACAAA TGAAAATCCA 5760

GCTGTGACAA TTTTACATGC ACAAGGGACG CCAAATGTAA CACCAGAGTT GATTGCAGAA 5820

ACGAATGAGA AGTACGGTTT CAATGATCCA TTATTAATTC AATATAAAAA TTGGTTACTT 5880

GAAGCGATGC AATTTAATTT TGGTACAAGC TACATTACAG GTGACCCAGT TGCTGAACGT 5940

ATTGGTCCAG CATTTATGAA TACATTGAAA TTAACAATAA TTTCAAGTGT TATGGTGATG 6000

ATTACATCAA TTATTTTAGG TGTAGTTAGT GCATTAAAAA GAGGAAAGTT CACTGATCGT 6060

GCGATACGTT CAGTGGCTTT CTTTCTAACT GCATTACCAT CATAITGGAT AGCTTCAATA 6120

CTTATTATTT ACGTTTCAGT GAAGTTAAAC ATATTGCCGA CTTCTGGATT AACAGGTCCA 6180

GAAAGTTACA TATTGCCAGT GATCGTTATT ACGATTGCCT ATGCTGGTAT TTACTTTAGA 6240

AATGTTAGAC GCTCGATGGT GGAACAATTA AATGAAGATT ATGTACTTTA TTTAAGAGCA 6300

AGCGGTGTGA AATCTATCAC ATTAATGTTG CATGTGTTGC GTAATGCTTT ACAAGTTGCG 6360

GTATCAATCT TTTGTATGTC TATACCAATG ATAATGGGTG GACTAGTTGT TATCGAGTAT 6420

ATCTTTGCAT GGCCTGGACT AGGTCAATTA AGTTTAAAAG CAATACTTGA ACACGATTTT 6480

CCAGTCATTC AAGCATATGT ATTAATTGTA GCGGTATTAT TTATTGTATT TAATACATTA 6540

GCAGATATCA TTAATGCGCT ATTAAATCCA AGATTAAGGG aGGGCGCACG ATGATAATTT 6600

TAAAmCGATT ATTmCArGwT AAAGGTGCAG TAATTGCTTT AGGCATTATT GTATTATATG 6660

TCTTTTTAGG ATTAGCAGCA CCAC7TGTGA CATTTTATGA TCCTAACCAT ATCGATACAG 6720

CAAACAAATT TGCTGGCATG AGTTTTCAAC ATCTACTAGG TACTGACCAT TTAGGTAGAG 6780

TATTTGTTTC TGTACTTATT GGATCTATTT TAGGATTCTT ATCAGGATAT TTCCAA3GGT 6900

TTGTTGACGC CTTAATCATG CGTGCGTGTG ATGTTATGTT GGCATTCCCA AGTTATGTTG 6960

TAACGTTAGC ATTAATTGCA TTGTTTGGAA TGGGTGCCGA AAATAT7ATC ATGGCATTTA 7020

TTTTGACGCG TTGGGCATGG TTCTGTCGTG TTATACGTAC AAGTGTTATG CAGTACACTG 7080

CTTCTGACCA TGTAAGATTT GCTAAAACAA TCGGTATGAA TGATATGAAA ATTATTCACA 7140

AACATATTAT GCCATTAACA TTAGCAGATA TTGCTATCAT CTCTAGTAGC TCGATGTGTT 7200

CAATGATCTT GCAAATATCT GGCTTTTCAT TTTTAGGATT AGGTGTCAAA GCGCCTACTG 7260

CAGAGTGGGG CATGATGCTT AACGAaGCTA GAAAAGTGAT GTTTACACAT CCTGAAATGA 7320

TGTTTGCGCC AGGTATTGCC ATAGTGATTA TAGTGATGGC ATTTAACTTC TTATCCGATG 7380

CTTTACAAAT TGCTATTGAT CCCCGCATCT CTTCTAAAGA TAAACTTCGT TCTGTGAAAA 7440

AAGGAGTGGT GCAATCATGA CATTGTTAAC AGTTAAACAT TTGACGATTA CAGATACCTG 7500

GACAGATCAA CCACTCGTGA GTGATGTGAA TTTTACATTA ACTAAGGGTG AAaCTTTAGG 7560

CGTTATTGGA GAAAGTGGTA GTGGTAAATC AATCACTTGT AAATCGATTA TTGGTTTGAA 7620

TCCCGAACGA CTCGGGGTGA CAGGTGAAAT TATCTTTGAT GGTACAtCAA TGTTGTCATT 7680

ATCTGAATCG CAATTGAAAA AGTACCGTGG TAAAGACATT GCGATGGTCA TGCAACAAGG 7740

TAGTCGTGCC TTTGACCCAT CAACTACTGT CGGTAAACAA ATGTTTGAGA CTATGAAAGT 7800

ACATACGTCA ATGTCTACAC AAGAAATTGA AAAGACATTG ATTGAATATA TGGATTATTT 7860

AAGTTTGAAA GATCCTAAAC GTATATTAAA ATCATACCCT TACATGTTAT CAGGAGGAAT 7920

GTTACAGCGA TTGATGATTG CTTTAGCGTT AgcTTTgAAA CCAAAGTTAA TCATTGCTGA 7980

TGAGCCGACA ACGGCTTTAG ATACAATTAC ACAATATGAT GTACTGGAAG CATTTATAGA 8040

TATTAAAAAA CACTTTGACT GTGCGATGAT TTTCATTTCA CATGATTTAA CGGTTATTAA 8100

CAAGATTGCA GACCGTGTTG TTGTGATGAA AAATGGTCAG CTTATTGAAC AAGGGACACG 8160

TGAATCAGTC TTGCATCATC CAGAACATGT TTATACGArt ATTkTATTAT CAACGAAGAA 8220

GAAGATTAAT GATCATTTTA AACATGTGAT GAGGGGTGAT GTACATGATT AAAATTAAAG 8280

ATGTTGAAAA GTCATATCAA AGCGCACATG TTTTTAAGCG TCGTCGAACA CCTATCGTGA 8340

AAGGTGTGTC ATTTGAGTGT CCAATCGGTG CGACGATTGC GATTATCGGA GAAAGTGGTA 8400

GCGGTAAATC GACGTTGAGT CktATGATAT TAGGTATTGA GAAACCGGAT AAAGGTTGTG 8460

TAACCTTAAA TGATCAACCG ATGCATAAGA AGAAAGTGAG ACGTCATCAA ATTGGTGCTG 8520

TATTTCAAGA TTATACGTCA TCATTACATC CATTTCAGAC TGTTAGAGAA ATCTTATTTG 8580

TGTTGGAAGA AGTCGGTCTA TCTAAGGCAT ACATGGATAA ATATCCTAAT ATGTTATCAG 8700

GTGGAGAGGC GCAACGTGTT GCGATTGCGC GTGCAATATG TATTAACCCT AAATATATTT 8760

TGTTTGATGA AGCCATTAGT TCACTCGACA TGTCAATTCA AACACAAATA TTAGATTTAT 8820

TGATTCATTT ACGTGAAACG CGTCAGTTGA GTTATATTTT TATCACACAT GATATTCAAG 8880

CTGCCACGTA TTTATG7GAT CAATTAATTA TTTTTAAAAA CGGAAAAATA GAAGAACAAA 8940

TTCCGACAAG CGCATTGCAT AAAAGTGACA ATGCTTATAC AAGAGAATTA ATAGAAAAAC 9000

AACTATCATT CTAAGGAGTG AGATAATGAA AGGTGCAATG GCTTGGCCCT TTTTGAGATT 9060

ATATATATTA ACATTGATGT TCTTTAGTGC CAATGCAATC TTAAACGTGT TTATACCTTT 9120

ACGAGGGCAT GATTTAGGCG CAACGAATAC GGTTATCGGT ATCGTTATGG GGGCATACAT 9180

GTTAACAGCA ATGGTATTTC GACCATGGGC AGGACAAATT ATTGCTCGTG TCGGTCCCAT 9240

TAAAGTATTA AGAATTATTT TGATTATCAA TGCCATAGCT TTAATTATTT ATGGTTTTAC 9300

TGGCTTAGAA GGTTATTTCG TAGCACGTGT TATGCAAGGT GTGTGTACGG CATTCTTTTC 9360

TATGTCTTTA CAGCTAGGTA TTATTGATGC ATTACCAGAG GAACATCGTT CTGAAGGTGT 9420

ATCATTGTAC TCGCTATTTT CAACGATTCC AAACTTAATC GGACCATTAG TTGCCGTAGG 9480

TATTTGGAAT GCAAATAATA TTTCACTATT TGCAATTGTC ATTATCTTTA TCGCATTAAC 9540

AACAACATTC TTTGsTATCG CGTGACCTTT GCTGAACAGG AACCCGATAC GTCAGATAAG 9600

ATTGAAAAAA TGCCGTTTAA CGCTGTAACT GTTTTTGCGC AATTTTTCAA AAATAAAGAG 9660

TTGTTGAACA GTGGTATTAT CATGATTGTT GCATCGATTG TATTTGGTGC AGTTAGTACA 9720

TTTGTACCGT TATACACAGT GAGTTTAGGA TTCGCGAATG CGGGAATCTT TTTGACAATA 9780

CAGGCCATCG CAGTTGTTGC GGCAAGATTT TACTTAAGGA AATACATTCC GTCAGATGGT 9840

ATGTGGCATC CTAAATATAT GGTATCTGTA CTATCATTAT TAGTAATCGC GTCATTTGTA 9900

GTGGCATTTG GTCCGCAAGT AGGTGCAATT ATTTTCTATG GTAGTGCGAT ATTAATAGGA 9960

ATGACGCAAG CAATGGTGTA CCCAACATTA ACATCATACT TAAGCTTCGT CTTACCAAAA 10020

GTAGGTCGTA ATATGTTGTT AGG7TTATTT ATTGCCTGTG CAGACTTAGG TATATCGTTA 10080

GGTGGCGCAT TGATGGGACC TATTTCCGAT TTAGTAGGAT TTAAATGGAT GTATCTAATT 10140

TGTGGTATGT TAGTCATTGT AATAATGATT ATGAGTTTCT TGAAAAAGCC AACACCACGT 10200

CCAGCGAGTA GTCTTTAATG AAGTGAATTA AAGCATATTA AGTTAATGAA TATTTAAATT 10260

TTAAAAGGTA TATTGaGCAT GGCGATTCAT GTGCTTCATG CTAGGACATG AAACATTCTA 10320

TATGGCTCGT TTTTAGAACG ACAtATATCT AAATAAAGCA CGCTTArAAG TGAGTTTTGA 10380

TTACATGAAA ATATGCAAAA CGAGTATAAC TGCTAATTGA TAGAAATAGC TCACCATAAA 10500

ATTACGGTAT GATTTTAAAT ATAAGTAAGT CGCACTACCT GCTAGTATCA ATGCTGGAAT 10560

GAATTCCCAC CATGTATTAA TGTATGGATA GTAGAACAGA GTTTCAAGGA TAATGGACAA 10620

TACTATTGTA ATCTTTAAAG GTATTAATCT GCTTAATTCT TGAATTAAAA TATGACGGAA 10680

AATAAGTTGA CAAATCAAAG TATTTAATAT AATGGTTAAC GAAAATATAG CTATTAAACT 10740

GATGGAaCCA TACCCTTTAA TGAGCGGGTA AATGTCAAAG ACAGTAAAGG AATCTACATT 10800

TAG7GCGAAA ATATTGAAAT GATTTAAAAG TAAAAAGAGT ACGACACTTA GTGTAAATGA 10860

TATAAGAATA TGCCATTTAT ATTTAGCACT AGCAACGATT TGCGAACGTA TCATTGGAAT 10920

AAACGCATCT TCATGCATCA GACGAAAAAT AGCTAGTGAA ATAATAACTG CGAGTAAATA 10980

GCTAATGTTC ATTGAAATAG GAAAAGAGAA ACCCCACGGA GCTTGTTGAG TGAATACAGC 11040

TACTAACCCA AAAGTTAAAA AGACGATAAT GATCGGCAAG ATGTTAACCA AAAATATGTA 11100

AAGGAAAATA AATCCAATAT CACGTTTGAA AAAACGCGAT TGTTCGGTAG CGTATTCTTC 11160

TTCTATGTAA TGTTTATTTG TATTTGACAT AGTATACCTC TTAAATAGTT GTATTATATA 11220

GATACTTTAG CACATATTAC TTTGTATTGT ATGTTTTATA CATTAAAATT TAAAATGAAA 11280

AACATATCAT AAAATTGTTT TATAAAATGA AGCGCTTCCA TTGTGTTTTG TTTTGTAAGG 11340

TGTATCATAA ATATTGAATT GAAATTTTGG GGGGAGGTAT TGTAATGACG TTTCTTACAG 11400

TCATGCAATT TATAGTTAAC ATTATCGTTG TAGGATTCAT GCTTACGGTT ATTGTTATCG 11460

GGCTTATTTG GTTAATTAAA GATAAAAGAC AATCACAACA TAGTGTATTA AGGAATTATC 11520

CTTTACTAGC ACGTATTAGA TATATTTCAG AAAAAATGGG ACCGGAATTA CGTCAGTATT 11580

TATTTTCTGG GGATAATGAA GGGAAACCTT TTTCACGTAA TGATTATAAA AATATCGTTT 11640

TGGCfcGAAA ATATAACTCT CGTATGACCA GCTTCGGTAC TACTAAAGAT TATCAAGACG 11700

GCTTTTACAT ACAGAACACA ATGTTTCCGA TGCAACGTAA TGAGATTTCA GTAGATAATA 11760

CAACATTGTT ATCAACATTC ATTTATAAAA TCGCGAATGA GCGTTTATTT AGTCGTGAAG 11820

AATATCGTGT GCCGACAAAG ATTGATCCGT ATTACTTAAG TGATGACCAT GCAATAAAAT 11880

TAGGTGAACA TTTAAAACAT CCATTTATTT TAAAACTCTAT TPTHTTATP A 11940

GTTATGGCGC TTTAGGAAAA AATGCCATTA CAGCTTTATC TAAAGGTCTA GCTAAAGCGG 12000

GCACTTGGAT GAATACAGGT GAAGGTGGCT TATCAGAATA TCATTTAAAA GGTAATGGGG 12060

ATATCATTTT CCAAATTGGT CCCGGTTTAT TTGGTGTTCG TGATAAAGAA GGTAATTTTA 12120




GTGAAGGTTT ATTTAAAGAG GTTGCACAGT TATCTAACGT ACGCGCATTT GAGCTGAAGT 12180

TTGCTAAAAT CCGAAATGTT GAACCTTATA AAACAATCAA TTCACCTAAC CGTTACGAAT 12300

TTATTCATAA TGCTGAAGAT TTGATTCGTT TCGTCGATCA GTTGCAGCAA TTAGGTCAAA 12360

AACCAGTAGG ATTCAAAATT GTAGTAAGCA AAGTTTCAGA AATTGAAACA CTTGTACGTA 12420

CGATGGTGGA ACTAGATAAG TATCCAAGCT TTATTACGAT TGATGGTGGT GAAGGTGGTA 12480

CTGGTGCAAC ATTCCAAGAA TTACAAGATG GTGTTGGCTT ACCGCTATTT ACAGCTCTAC 12540

CTATTGTGTC TGGCATG1TA GAAAAATATG GTATTCGAGA TAAAGTGAAA TTGGCGGCAT 12600

CTGGTAAGTT AGTGACACCA GATAAAATTG CGATTGCACT AGGTTTAGGT GCAGATTTTG 12660

TAAATATCGC ACGTGGGATG ATGATTAGTG TCGGTTGTAT AATGAGTCAA CAATGTCACA 12720

TGAATACGTG TCCTGTAGGT GTTGCAACGA CAGATGCGAA GAAAGAAAAA GCATTGATTG 12780

TTGGAGAAAA GCAATATCGT GTCACAAACT ATGTAACAAG TTTGCATGAA GGCTTATTCA 12840

ATATTGCAGC AGCTGTTGGC GTATCCAGTC CTACAGAAAT TACTGCTGAT CATATTGTAT 12900

ATCGAAAAGT CGATGGTGAG TTACAAACGA TACATGATTA TAAATTAAAA CTCATTAGTT 12960

AACTTAATTA TTTCGGGAAA TTGAAAGCAG CGGATTTTAG CGTTACTGCA AATAATTTTA 13020

TATTAGTAGT GGATGCTGGT CACACAAGAA CTTCAAATAT TAAAGCCCTC AGAATATGAA 13080

TTAAGGTTTG TAACCTTAGT CTTATCTGAG GGCATTTTTA AGTTATAAAC TATTTGTCGT 13140

CCATTTTATC TTTTTCTTTT AAACCTCTGT GCTTTAATTG CTTTTCAAGT TTTTCAAAAC 13200

TAATATCTTT ATTTTCTTTA GTCGAAACAC CAAGACGTTT ATTTAATTTT TTCATGTCAA 13260

CTTCTGTGTA ATCTATGTCT AAGTGyTCAA TTGCTTTTTT ATCTTTATAG TCTACTTTGT 13320

ATTTTACGCC TTTAAGGTCT TTGAAAATAC TTTCAGATTT GGCGAATAAC TTTn'GGCTT 13380

CGTCTTTATC CATACCTAGA TCGTCATATT TAATTGTGTT GATTGTAGAC TGTTTTAAAA 13440

CTTTATCATC TTTATATGTG ATAGAAGTTA GTACATGTTT ACCACTAACA TCACCvTCAT 13500

ATGTTTTGGT 7TGTTCTTTA CCACAAGCTG ATAATGCAAT GATACAAACT AATGCTACTA 13560

CAATTAATGA ACATAATTTT TTCAAAGTCA GTCGCCTTCT TTCGATATTT GTATTATAAA 13620

GAAATTATAA CATTTACTAA AAAATGATGT TATTCAAAAA TTTAAATTTT GTCATTTTTT 13680

TTGAAGATAT GAGTTTTTTT AAGCGGATTC CTCACAAAAT TTTAAAAATA 1 i l/trtAj^L. 1 Is. 13740

AAAATGATAA AGCGkTAGGG AACGTTTTTC TGAAAGTTAG TGATACAATA GTTTTAAGTT 13800

GAAATACAGG AGGATGAATA ACATGAATCA GTCAGTCAAA TTACTTAAAC ATTTAACAGA 13860

TGTAAACGGC ATTGCTGGTT ATGAAATGCA AGTTAAAGAA GCAATGCGTa ACTATATAGA 13920




GCCTGTCAGT GATCAAATTA TTGAAGATAA CTTGGGTGGC ATTTTTGGAA AGAAAAATGC 13980

AACAAAGATT GATAAACATG GTTTTATTTC ATTTACGCCA kTgGTGGATG GTGGAATCAA 14100

GTCATGCTAT CTCAAAAAGT AACGATTACA ACAGATTCGG GCAAAGAAAT TAGAGGTATC 14160

ATCGGTTCTA AACCGCCACA TGTCTTAACG CCTGAAGAAC GTAAAAAGCC AATGGAAATC 14220

AAAAATATGT TTATAGATAT TGGTGTTAGT AGCAAGGAAG AAGCTGAAGA AGCTGGCGTT 14280

GAAGTAGGCA ATATGGTTAC GCCATATAGT GAATTTGAAG TGCTTGCAAA TGATAAATAT 14340

TTAACTGCGA ArCATTTGAT AATCGCTATG GCTGTGCATT AGCTATTGAG GTATTAAAAC 14400

GTTTAAAAGA TGAAAATA7T GGCATTAACT TATACAGTGG TGCCACAGTG CAAGAAGAAG 14460

TTGGTTTGCG TGGTGCGAAA GTGGCAGCGA ATACGATTAA ACCAGACTTG GCGATAgcTG 14520

TcGATGTAGG TATTGCTTAT GATACCCCAG GTATGTCAGG TCAAACGAGC GATAGTAAAC 14580

TAGGCGGTGG TCCAGTTGTC ATTATGATGG ATGCTACAAG TATTGCTCAC CAAGGTTTGC 14640

GAAAgcATaT TAAAGATGTA GCTAAGGAAC ATAACATCGA AGTACAATGG GATACGACAC 14700

CAGGTGGAGG TACAGATGCG GGAAGTATTC ATGTCGCAAA TGAAGGTATT CCAACGATGA 14760

CAATCGGTGT TACGCTGCGA TACATGCATT CTAATGTTTC AGTGCTCAAT GTAGATGATT 14820

ATGAAAATTC TATCCGTCTT GTTACTGAAA TTGTCCGTTC ATTGAATGAT GAAAGTTATA 14880

AAAATATCAT GTGGTAATCA AATCCATAAA TAATAAAGAA TCCTTTTAAT ATGGTAGGTT 14940

GTTAAACAAT TGTCTAATTT TAATTCTTAG TCATTAGACA GTATCCATGT TAATAGGATT 15000

TTTTGTTTTT AATTTAAATG CTGAAAATCA ATTATGCCTA AATTTTGATA TTACAAGAAA 15060

ATGATTTTTT CTTAAATGTA ATTGCACTAA AAACCAAAAA AACGGGAATA ATATACCTGA 15120

TATATTACAT GAGGAGCGGT GCAAATGTTG TTAGAAATTA AAGATTTAGT GTATAAAGCG 15180

AGCGATAGAA TCATACTAGA TCATATCAGT CTAAAAGTAG ATAAAGGCGA GAGTATTGCC 15240

ATTATAGGTC CATCAGGTAG TGGTAAAAGT ACATTTCAAA AGCAAATATG TAATTTGTTT 15300

AGTCCAACTA GTGGAGAACT TTATTTTAAA GGTAAACCCT ATAATGATTA TGACCCGGAA 15360

GAATTGCGTC AACGAATCAG TTATTTGATG CAGCAAAGTG ACTTGTTTGG TGAAACGATT 15420

GAAGATAACA TGATATTCCC ATCACTTGCA CGTAATGATA AATTTGATAG AAAACGTGCA 15480

AAGCAATTAA TTAAAGATGT CGGTTTGGGA CATTATCAAT TAAGTTCGGA AGTGGAAAAT 15540

ATGTCGGGTG GTGAGCGGCA AAGAATTGCT ATAGCGCGCC AACTGATGTA TACACCGGAT 15600

ATTCTTTTAT TAGATGAATC GACCAGTGCA TTAGACGTTA ATAATAAAGA AAAGATAGAA 15660

AATATCATTT TTAAATTAGC AGATCAAGGC GTGGCAATTA TGTGGATTAC CCACAGCGAT 15720

GACCAAAGTA TGCGACACTT TCAAAAGCGT ATAACAATTG TTGATGGTCA AATTTCTAAT 15780






